Elements of statistical disclosure control

Bibliographic details

Elements of statistical disclosure control

Leon Willenborg, Ton de Waal

(Lecture notes in statistics, 155)

Springer, c2001

University library holdings: 41


Notes

Includes bibliographical references (p. [245]-254) and index

Description and table of contents

Description

Statistical disclosure control is the discipline that deals with producing statistical data that are safe enough to be released to external researchers. This book concentrates on the methodology of the area. It deals with both microdata (individual data) and tabular (aggregated) data. The book attempts to develop the theory from what can be called the paradigm of statistical confidentiality: to modify unsafe data in such a way that safe (enough) data emerge, with minimum information loss. This book discusses what safe data are, how information loss can be measured, and how to modify the data in a (near) optimal way. Once it has been decided how to measure safety and information loss, the production of safe data from unsafe data is often a matter of solving an optimization problem. Several such problems are discussed in the book, and most of them turn out to be hard problems that can be solved only approximately. The authors present new results that have not been published before. The book does not describe a closed area; on the contrary, the area still has many parts waiting to be explored more fully. Some of these are indicated in the book. The book will be useful for official, social and medical statisticians and others who are involved in releasing personal or business data for statistical use. Operations researchers may be interested in the optimization problems involved, particularly for the challenges they present. Leon Willenborg has worked at the Department of Statistical Methods at Statistics Netherlands since 1983, first as a researcher and since 1989 as a senior researcher. Since 1989 his main field of research and consultancy has been statistical disclosure control. From 1996 to 1998 he was the project coordinator of the EU co-funded SDC project.

Table of contents

1 Overview of the Area.- 1.1 Introduction.- 1.2 Types of Variables.- 1.2.1 Categorical variable.- 1.2.2 Hierarchical variable.- 1.2.3 Continuous/Numerical/Quantitative Variable.- 1.2.4 Identifying Variable.- 1.2.5 Sensitive Variable.- 1.2.6 Weight Variable.- 1.2.7 Regional Variable.- 1.2.8 Household Variable.- 1.2.9 Spanning Variable and Response Variable.- 1.2.10 Shadow Variable.- 1.3 Types of Microdata.- 1.3.1 Simple Microdata.- 1.3.2 Complex Microdata.- 1.4 Types of Tabular Data.- 1.4.1 Single Tables.- 1.4.2 Marginal Tables.- 1.4.3 Hierarchical Tables.- 1.4.4 Linked Tables.- 1.4.5 Semi-linked Tables.- 1.4.6 Complex Tables.- 1.4.7 Tables from Hierarchical Microdata.- 1.5 Introduction to SDC for Microdata and Tables.- 1.6 Intruders and Disclosure Scenarios.- 1.7 Information Loss.- 1.7.1 Information Loss for Microdata.- 1.7.2 Information Loss for Tables.- 1.8 Disclosure Protection Techniques for Microdata.- 1.8.1 Local Recoding.- 1.8.2 Global Recoding.- 1.8.3 Local Suppression.- 1.8.4 Local Suppression with Imputation.- 1.8.5 Synthetic Microdata and Multiple Imputation.- 1.8.6 Subsampling.- 1.8.7 Adding Noise.- 1.8.8 Rounding.- 1.8.9 Microaggregation.- 1.8.10 PRAM.- 1.8.11 Data Swapping.- 1.9 Disclosure Protection Techniques for Tables.- 1.9.1 Table Redesign.- 1.9.2 Cell Suppression.- 1.9.3 Adding Noise.- 1.9.4 Rounding.- 1.9.5 Source Data Perturbation.

2 Disclosure Risks for Microdata.- 2.1 Introduction.- 2.2 Microdata.- 2.3 Disclosure Scenario.- 2.4 Predictive Disclosure.- 2.5 Re-identification Risk.- 2.6 Risk Per Record and Overall Risk.- 2.7 Population Uniqueness and Unsafe Combinations.- 2.8 Modeling Risks with Discrete Key Variables.- 2.8.1 Direct Approach.- 2.8.2 Model Based Approach.- 2.9 Disclosure Scenarios in Practice.- 2.9.1 Researcher Scenario.- 2.9.2 Hacker Scenario.- 2.10 Combinations to Check.- 2.10.1 A Priori Specified Combinations.- 2.10.2 Data Driven Combinations: Fingerprinting.- 2.11 Practical Safety Criteria for Perturbative Techniques.

3 Data Analytic Impact of SDC Techniques on Microdata.- 3.1 Introduction.- 3.2 The Variance Impact of SDC Procedures.- 3.3 The Bias Impact of SDC Procedures.- 3.4 Impact of SDC Procedures on Methods of Estimation.- 3.5 Information Loss Measures Based on Entropy.- 3.5.1 Local Recoding.- 3.5.2 Local Suppression.- 3.5.3 Global Recoding.- 3.5.4 PRAM.- 3.5.5 Data Swapping.- 3.5.6 Adding Noise.- 3.5.7 Rounding.- 3.5.8 Microaggregation.- 3.6 Alternative Information Loss Measures.- 3.6.1 Subjective Measures for Non-perturbative SDC Techniques.- 3.6.2 Subjective Measures for Perturbative SDC Techniques.- 3.6.3 Flow Measure for PRAM.- 3.7 MSP for Microdata.

4 Application of Non-Perturbative SDC Techniques for Microdata.- 4.1 Introduction.- 4.2 Local Suppression.- 4.2.1 MINUCs Introduced.- 4.2.2 Minimizing the Number of Local Suppressions.- 4.2.3 Minimizing the Number of Different Suppressed Categories.- 4.2.4 Extended Local Suppression Models.- 4.2.5 MINUCs and μ-ARGUS.- 4.3 Global Recoding.- 4.3.1 Free Global Recoding.- 4.3.2 Precoded Global Recoding.- 4.4 Global Recoding and Local Suppression Combined.

5 Application of Perturbative SDC Techniques for Microdata.- 5.1 Introduction.- 5.2 Overview.- 5.3 Adding Noise.- 5.4 Rounding.- 5.4.1 Univariate Deterministic Rounding.- 5.4.2 Univariate Stochastic Rounding.- 5.4.3 Multivariate Rounding.- 5.5 Derivation of PRAM Matrices.- 5.5.1 Preparations.- 5.5.2 Model I: A Two-step Model.- 5.5.3 Model II: A One-step Model.- 5.5.4 Two-stage PRAM.- 5.5.5 Construction of PRAM Matrices.- 5.5.6 Some Comments on PRAM.- 5.6 Data Swapping.- 5.7 Adjustment Weights.- 5.7.1 Disclosing Poststrata.- 5.7.2 Disclosure for Multiplicative Weighting.- 5.7.3 Disclosure Control for Poststrata.

6 Disclosure Risk for Tabular Data.- 6.1 Introduction.- 6.2 Disclosure Risk for Tables of Magnitude Tables.- 6.2.1 Linear Sensitivity Measures.- 6.2.2 Dominance Rule.- 6.2.3 Prior-posterior Rule.- 6.2.4 Intruder's Knowledge of the Sensitivity Criterion Used.- 6.2.5 Magnitude Tables from a Sample.- 6.3 Disclosure Risk for Frequency Count Tables.- 6.3.1 Frequency Count Tables Based on a Complete Enumeration.- 6.3.2 Frequency Count Tables Based on Sample Data.- 6.4 Linked Tables.- 6.5 Protection Intervals for Sensitive Cells.- 6.6 Sensitivity Rules for General Tables.

7.2 Information Loss Based on Cell Weights.- 7.2.1 Secondary Cell Suppression.- 7.2.2 Rounding.- 7.2.3 Table Redesign.- 7.3 MSP for Tables.- 7.3.1 Table Redesign.- 7.3.2 Secondary Cell Suppression.- 7.3.3 Rounding.- 7.4 Entropy Considerations.- 7.4.1 Some General Remarks.- 7.4.2 Tabulation.- 7.4.3 Cell Suppression.- 7.4.4 Table Redesign.- 7.4.5 Rounding.

8 Application of Non-Perturbative Techniques for Tabular Data.- 8.1 Introduction.- 8.2 Table Redesign.- 8.3 Cell Suppression.- 8.4 Some Additional Cell Suppression Terminology.- 8.4.1 The Zero-Extended Table.- 8.4.2 Paths, Cycles and Their Cells.- 8.4.3 Network Formulation for Two-dimensional Tables.- 8.5 Hypercube Method.- 8.6 Secondary Suppression as an LP-Problem.- 8.6.1 The Underlying Idea.- 8.7 Secondary Suppression as a MIP.- 8.7.1 Lougee-Heimer's Model.- 8.7.2 Kelly's Model.- 8.7.3 Geurts' Model.- 8.7.4 Fischetti and Salazar's Model.- 8.7.5 Partial Cell Suppression.- 8.8 Cell Suppression in Linked Tables.- 8.8.1 Top-Down Approach.- 8.8.2 Approach Based on MIP.- 8.9 Cell Suppression in General Two-Dimensional Tables.- 8.10 Cell Suppression in General Three-Dimensional Tables.- 8.11 Comments on Cell Suppression.

9 Application of Perturbative Techniques for Tabular Data.- 9.1 Introduction.- 9.2 Adding Noise.- 9.3 Unrestricted Rounding.- 9.3.1 Deterministic Rounding.- 9.3.2 Stochastic Rounding.- 9.4 Controlled Rounding.- 9.4.1 Controlled Rounding in One-Dimensional Tables.- 9.4.2 Controlled Rounding in Two-dimensional Tables.- 9.5 Controlled Rounding by Means of Simulated Annealing.- 9.5.1 Simulated Annealing.- 9.5.2 Applying Simulated Annealing to the Controlled Rounding Problem.- 9.6 Controlled Rounding as a MIP.- 9.6.1 The Controlled Rounding Problem for Two-dimensional Tables.- 9.6.2 The Controlled Rounding Problem for Three-dimensional Tables.- 9.7 Linked Tables.- 9.7.1 Rounding in Linked Tables.- 9.7.2 Source Data Perturbation.

References.

Source: Nielsen BookData
