Infoshare Online

Infoshare FAQs

Infoshare Data

What do Infoshare tables look like?
What is the source of Infoshare data?
How does Infoshare provide information for geographic areas when data are not published for such areas?
Why may some % values in Module 1 be incorrect?
What does the first column in Module 2 represent?

Infoshare Geographies

What is a "NYC Neighborhood"?
What is a "Neighborhood Tabulation Area"?
How can I compare data for a local neighborhood in New York City with corresponding data for the Borough and City?
How do I find out which Zip Codes are in a Community District or other types of geographic areas?
When I add the values for School Districts, City Council Districts, etc., they don't always equal the Borough, City, County, or State totals. Why is that?
Do you have maps of the geographic areas in Infoshare?

Demographic Data

Why is some data missing from the Census?
What is the Margin of Error, or MOE? How is it calculated?
What is the 'adjusted' "Hispanic Population by National Origin"?
How do I convert dollar values from earlier census data to current values?
What is the "poverty level"?
What does "n.e.c." mean?
Why do I find a lot of 5s in the immigration data?
Why does immigration appear to be less than we "observe" in our neighborhoods? And why does immigration appear to decline about thirty percent for 1997 and 1998?

Health Data

Why is the Birth and Death data available only for multi-year periods after 1998?
What is the difference between "Persons" and "Admissions" in the Hospital Admissions file?
What are ICD-9 Codes? How can I find data on hospitalizations for asthma or drug abuse?
Why do I find a lot of 3s and 5s in Birth, Death, and Hospital Admission tables and in the AIDS data?



What do Infoshare tables look like?
To see examples of Infoshare tables, click here to see an example of a Profile or a Comparison.

What is the source of Infoshare data?
All data files are obtained from City, State, and Federal government agencies. Community Studies of New York is in constant contact with these agencies and incorporates the newest data as soon as it becomes available. We then convert the data into a standard form that can be displayed in the Infoshare Online system.

How does Infoshare provide information for geographic areas when data are not published for such areas?
Community Studies obtains from its data sources (government agencies, some commercial firms) data at the smallest geographic area at which it is publicly available, usually census tracts and zip codes. To provide data for other geographies, we have, over the past decade, developed a series of overlap factors which allow us to convert this small-scale data into these larger areas of special interest. Medians, means, and per capita values for larger geographies are computed using weighted averages of these values for the smaller areas of which they are composed. For instance, medians for zip codes are calculated by using weighted averages of the medians for the census tracts which compose each zip code, utilizing the overlap of each census tract with each zip code.
Such overlap factors are based upon the distribution of residential housing, when this is available. In that case, the overlaps approximate as closely as possible the distribution of the population by residence. Where this residential data is not available, geographic overlap factors are derived using standard geographic mapping overlays.
Overlap factors (for instance, the percentage of a zip code in a particular State Assembly District) are maintained to three significant figure accuracy (e.g., 12.3%). Therefore, data which may appear to have more than three significant figures should be treated carefully. In general, census tract, zip code, borough-wide, and City-wide data will be fully accurate, but data for other geographic areas will only be as accurate as the overlap factors. These are generally expected to be accurate to 1-3%.
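As a minimal sketch of how overlap factors might be applied, the example below allocates tract counts to a zip code and approximates the zip code median as a weighted average of tract medians. The tract names, populations, medians, and factors are hypothetical. Weighting each median by overlap times tract population is an assumption, since the text above does not state the exact weights Infoshare uses.

```python
# Hypothetical census tracts overlapping one zip code.
# "overlap" is assumed to be the fraction of the tract's residents in the zip code.
tracts = [
    {"tract": "A", "population": 4000, "median_income": 50000, "overlap": 1.0},
    {"tract": "B", "population": 3000, "median_income": 40000, "overlap": 0.5},
    {"tract": "C", "population": 5000, "median_income": 60000, "overlap": 0.1},
]


def allocated_count(tracts, field):
    """Sum a count field over tracts, scaled by each tract's overlap factor."""
    return sum(t[field] * t["overlap"] for t in tracts)


def weighted_median_estimate(tracts, field, weight_field="population"):
    """Approximate an area median as a weighted average of tract medians.

    Weights are overlap * weight_field (an assumed weighting scheme).
    """
    weights = [t[weight_field] * t["overlap"] for t in tracts]
    total = sum(weights)
    return sum(w * t[field] for w, t in zip(weights, tracts)) / total


zip_population = allocated_count(tracts, "population")
zip_median = weighted_median_estimate(tracts, "median_income")
print(zip_population, round(zip_median))
```

Because the factors carry only about three significant figures, results computed this way should not be read as more precise than that.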


Why may some % values in Module 1 be incorrect?
The % column for a table in Module 1 is computed using a standard algorithm (or rule) under which all the values in the table for which percentages are shown are summed, and then the percent is computed for each data item using that total. (Some data items may be omitted because it is clear that they should not be included; for instance, they might duplicate another item in the table.) This simple rule will work for many tables, but there are many for which it gives incorrect or misleading answers, for instance, tables where there are groups of data items that should be treated separately when computing percentages. USE THE % COLUMN WITH CARE!
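The rule described above can be sketched as follows. This is an illustration only, not Infoshare's actual code; the table rows are hypothetical and were chosen to show how the simple rule misleads when a table contains two groups that should each be percentaged separately.

```python
def percent_column(rows):
    """Compute each item's share of the sum of all items in the table."""
    total = sum(value for _, value in rows)
    return [(label, round(100.0 * value / total, 1)) for label, value in rows]


# Hypothetical table mixing two breakdowns of the same 1,000 people.
rows = [
    ("Male", 480),
    ("Female", 520),
    ("Under 18", 250),
    ("18 and over", 750),
]

for label, pct in percent_column(rows):
    print(label, pct)
```

Here every person is counted twice in the total, so each percentage comes out at half its meaningful value (for example, "Male" shows 24.0 rather than 48.0).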
When I add the values I find for School Districts, City Council Districts, etc., they don't always equal the Borough, City, County, or State totals. Why is that?
Community Studies obtains from its data sources (government agencies, some commercial firms) data at the smallest geographic area at which it is publicly available, usually census tracts and zip codes. To provide data for other geographies, we have, over the past decade, developed a series of overlap factors which convert this small-scale data into these larger areas of special interest. Such overlap factors are based upon the distribution of residential housing, when this is available. In that case, the overlaps approximate as closely as possible the distribution of the population by residence. Where this residential data is not available, geographic overlap factors are derived using standard geographic mapping overlays.
Because this data cannot be exact, there are small discrepancies in areas such as City Council Districts for which the data was obtained using an overlap function. Overlap factors (for instance, the percentage of a zip code in a particular State Assembly District) are maintained to three significant figure accuracy (e.g., 12.3%). Therefore, data which may appear to have more than three significant figures should be treated carefully. In general, census tract, zip code, borough-wide, and City-wide data will be fully accurate, but data for other geographic areas will only be as accurate as the overlap factors. These are generally expected to be accurate to 1-3%.

How do I find out which Zip Codes are in a Community District or other types of geographic areas?
From the Main Menu, select the tab <Area Comparison> or the link for <Module 2- Compare Areas Using Selected Data>. For <Areas to Compare>, select <Zip Code>. For <Coverage Region Type>, select <Community District> (or the type of area you want). Choose the Community District or other area you wish to view. For <Data File>, <Table>, and <Data>, choose any option you wish, e.g., <Demographics>, <2000 Census>, <Total Population>. Then view your table. In the left two columns of your table you will see listed the zip codes within your chosen area and the proportion of each zip code that is included in that area.
To apply this to any other type of overlap, substitute the smaller geographic area you wish to see for Zip Code and the Coverage Region for Community District, e.g., Census Tracts within a Zip Code; School Districts within a Borough; or Zip Codes within a Police Precinct.

What does the first column in Module 2 represent?
The first column in any table produced in Module 2 will be labeled 'MapID'. This column contains values that will be recognized by your mapping software if you choose to map the data. If you are not using the tables for mapping, simply delete the first column from any file you save.

What is a "NYC Neighborhood"?
This is one of 292 neighborhoods in which New Yorkers generally think of themselves as residing. They are not precisely defined, and no government agency has specified official boundaries for them. Nevertheless, in the early 1990s an informal task force in the NYC Department of City Planning drew boundaries for them, and we are using these boundaries. In spite of their lack of official definition, these areas are useful, simply because they are the neighborhoods in which residents believe they live. To view or print a Map of NYC Neighborhoods in an Adobe Acrobat .pdf file, click NYC Neighborhood Map.

What is a "Neighborhood Tabulation Area"?
These 195 areas, usually referred to as NTAs, have been defined by the NYC Department of City Planning in order to produce projections for social and environmental planning purposes. They are defined as combinations of census tracts having populations of at least 15,000 people and lying within one of the Sub-borough Areas or PUMAs. To view or print a Map of NTAs in an Adobe Acrobat .pdf file, click Neighborhood Tabulation Area Map.

How can I compare data for a local area in New York City with corresponding data for the Borough and City (or a local area in New York State with the County and State values, or a County with the State and the Nation)?
Use <Module 2 - Area Comparison> and select, for both <Areas to Compare> and <Coverage Region Type>, the type of area you want to examine (e.g., City Council District). Select the particular area you want to examine (e.g., City Council District 14). Then choose the data elements you are interested in, as you would normally in Module 2. You will get a three-row table that shows the Citywide values, the Borough values, and the neighborhood values for each of these data elements.

Why is some data missing from the Census?
For some data items and some smaller geographic areas in the Census Bureau's American Community Survey, no data appears. This is because there were too few cases in the Census Bureau's survey sample to provide a reliable count. The data may be present in larger areas (e.g., county or borough), but not in smaller ones.
What is the Margin of Error, or MOE? How is it calculated?
Data values in the American Community Survey are based on a sample of the population in each neighborhood. They are therefore subject to sampling variation. This leads to some degree of uncertainty in any estimate, represented through the use of a Margin of Error (MOE) associated with each data element. The value shown in Infoshare for the data in Modules 1 and 2, containing the tables published by the Census Bureau, is the 90 percent margin of error. This is defined such that the true value of any data element has a 90% probability of being contained within the interval encompassed by the value of the data element plus or minus the Margin of Error (shown in parentheses in Infoshare). For those familiar with common statistical methods, the Margin of Error is equal to 1.645 multiplied by the Standard Error.
The Census Bureau provides the Margin of Error for data at the census tract, county, city, and state levels. For all other areas, the Margin of Error shown in Infoshare is estimated by multiplying the margin of error for each census tract by the fractional overlap of that tract with the area of concern, squaring each product, summing the squares, and taking the square root of the sum. In equation form:
MOE = SQRT( SUM over i of [ MOE(i) * Overlap(i) ]^2 )
where MOE(i) is the margin of error for census tract i and Overlap(i) is the fraction of that tract lying in the area. This provides an approximate estimate of the MOE. It neglects, for instance, any correlation among the census tract MOEs for the tracts that make up the area.
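The MOE estimate for combined areas can be written as a short function. This is a sketch of the formula given above, not Infoshare's own code; the tract MOEs and overlap factors in the example are hypothetical.

```python
import math

Z_90 = 1.645  # multiplier relating the 90 percent MOE to the standard error


def combined_moe(tract_moes, overlaps):
    """Approximate MOE for an area built from census tracts.

    Implements MOE = sqrt(sum_i (MOE_i * overlap_i)^2) and, like the
    formula in the text, ignores correlation between tracts.
    """
    if len(tract_moes) != len(overlaps):
        raise ValueError("need one overlap factor per tract MOE")
    return math.sqrt(sum((m * f) ** 2 for m, f in zip(tract_moes, overlaps)))


def standard_error(moe):
    """Recover the standard error from a 90 percent margin of error."""
    return moe / Z_90


# Hypothetical tracts with MOEs of 30 and 40, both wholly inside the area.
print(combined_moe([30.0, 40.0], [1.0, 1.0]))
```

With both tracts fully inside the area, the combined MOE is 50, which is smaller than the simple sum of 70 because independent sampling errors partly offset one another.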
What is the 'adjusted' "Hispanic Population by National Origin"?
The 2000 Census asked those who indicated they were of Hispanic origin to specify their particular country of origin or their family's country of origin. However, the question was apparently unclear to many people, and a large number failed to specify their origin. Thus, except for Mexican, Puerto Rican, and Cuban, for which check boxes were given, the numbers of people indicating they were of Dominican, Salvadoran, Colombian, etc., origin were much smaller than expected.
To correct for this, we have included in Infoshare an adjusted distribution of Hispanic ethnic origin, using a method suggested by John Logan of the University at Albany (see "The New Latinos: Who They Are, Where They Are", Lewis Mumford Center for Comparative Urban and Regional Research, University at Albany, Sept. 10, 2001). In the original data, as provided by the Census Bureau, the "Other Hispanic" category includes all those who failed to indicate their ethnic origin. It is thus much larger than it should be. Logan estimates, using other Census surveys, that for New York State the "Other Hispanic" category should be no more than 2.4% of the total. Using this, along with the Census values for each ethnicity, and weighting the Mexican, Puerto Rican, and Cuban correction at 1/10 of the rest, since they were shown explicitly and not as easily skipped, we adjusted the values in proportion to the values shown by the Census Bureau.

Why do I find a lot of 3s and 5s in Birth, Death, and Hospital Admission tables and in the AIDS data?
In order to maintain the confidentiality of Birth, Death, and Hospital Admission records, values between 1 and 5 cases and, more recently, between 1 and 10 cases, are suppressed in Birth, Death, and Hospital Admission tables, and values smaller than 5 are suppressed in the AIDS data. The values for "combination" geographic areas (e.g., legislative districts) in Modules 1 and 2, whose values are obtained from the values for smaller areas (e.g., census tracts or zip codes), as well as the row and column totals in Module 3, are also randomized (by adding a random number between 1 and 10) to preserve confidentiality when these include census tracts or zip codes with fewer than 10 cases. In order to avoid creating errors when users import these tables into spreadsheets and other software packages, a "3" or "5" is placed into these cells instead of some character such as an asterisk. If you do not want these "false" numbers to appear in your tables, simply replace them with an *.

How do I convert dollar values from earlier Census data to current values?
The Census asks respondents to report their income for the previous calendar year (e.g., for 2009 in the 2010 survey). The inflation factor to convert earlier to current values generally uses the Consumer Price Index, which can be found at www.bls.gov/data/inflation_calculator.htm; multiply dollars for the year they refer to by this factor to obtain their equivalent in current dollars.
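The conversion amounts to multiplying by a ratio of price index values. The sketch below assumes you have looked up the Consumer Price Index for the income year and for the current year; the index values shown are placeholders chosen for easy arithmetic, not real CPI figures.

```python
def to_current_dollars(amount, cpi_income_year, cpi_current_year):
    """Convert dollars from the income year to current dollars via a CPI ratio."""
    if cpi_income_year <= 0 or cpi_current_year <= 0:
        raise ValueError("CPI values must be positive")
    inflation_factor = cpi_current_year / cpi_income_year
    return amount * inflation_factor


# Placeholder index values, NOT real CPI figures.
cpi_then = 100.0
cpi_now = 125.0
print(to_current_dollars(40000, cpi_then, cpi_now))
```

Remember that the income year is the calendar year before the survey, so that is the year whose index value belongs in the denominator.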

What is the "poverty level"?
The "poverty level" is a set of money income thresholds that vary by family size and composition to provide a criterion for who is poor. If the total income for a family or unrelated individual falls below the relevant poverty threshold, then the family or unrelated individual is classified as being "below the poverty level." The 2017 poverty guidelines for different family sizes are as follows:

2017 POVERTY GUIDELINES FOR THE 48 CONTIGUOUS STATES AND THE DISTRICT OF COLUMBIA

Persons in family/household    Poverty guideline
1                              $12,060
2                              $16,240
3                              $20,420
4                              $24,600
5                              $28,780
6                              $32,960
7                              $37,140
8                              $41,320

For families/households with more than 8 persons, add $4,180 for each additional person.

Source: https://www.federalregister.gov/articles/2014/01/22/2014-01303/annual-update-of-the-hhs-poverty-guidelines
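As an illustration, the guideline table above can be turned into a small lookup. This sketch assumes the 2017 guideline values are the relevant thresholds, and it extends beyond 8 persons using the step between consecutive rows of the table ($16,240 minus $12,060), which is an assumption of this example.

```python
# 2017 guideline values for the 48 contiguous states and D.C., from the table above.
GUIDELINES_2017 = {
    1: 12060, 2: 16240, 3: 20420, 4: 24600,
    5: 28780, 6: 32960, 7: 37140, 8: 41320,
}

# Step between consecutive rows of the table, used for households above 8 persons.
PER_PERSON_INCREMENT = GUIDELINES_2017[2] - GUIDELINES_2017[1]


def poverty_guideline(household_size):
    """Return the guideline amount for a household of the given size."""
    if household_size < 1:
        raise ValueError("household size must be at least 1")
    if household_size in GUIDELINES_2017:
        return GUIDELINES_2017[household_size]
    return GUIDELINES_2017[8] + (household_size - 8) * PER_PERSON_INCREMENT


def is_below_poverty_level(total_income, household_size):
    """True when income falls below the guideline for that household size."""
    return total_income < poverty_guideline(household_size)


print(poverty_guideline(4), is_below_poverty_level(20000, 3))
```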

What does "n.e.c." mean?
"n.e.c." means "not elsewhere classified"; that is, miscellaneous items not included in any of the other specified categories.

Why do I find a lot of 5s in the immigration data?
The U.S. Department of Homeland Security, which now runs U.S. immigration services, has recently imposed confidentiality restrictions on the release of immigration data. For the years 1999 on, the actual number of cases in an area cannot be disclosed publicly if there are fewer than 10 cases. To avoid creating problems for users when they import these tables into spreadsheets and other software packages, a "5" is placed into these cells (instead of some character such as an asterisk) when there are between 1 and 9 cases.

Why does immigration appear to be less than we "observe" in our neighborhoods? And why does immigration decline about thirty percent for 1997 and 1998?
Immigration data is obtained from the U.S. Department of Homeland Security, which now runs the immigration services. All immigrants are asked for the zip code of their host or their expected residence. The data includes only legal immigration, which is generally estimated to be about one half of all immigration into the U.S. The former Immigration and Naturalization Service (INS) reported that the apparent decline in 1997 and 1998 reflects not a real decline in immigration but problems the INS encountered in processing applications.

What is the difference between "Persons" and "Admissions" in the Hospital Admissions file?
The "Persons" selection gives the number of different individuals admitted for a particular diagnosis. Individuals are counted only once when they are admitted for the same 3-digit ICD-9 diagnosis code. That is, if an individual is admitted two or more times with different diagnoses, that person will be counted each time, but repeated admissions of the same person for the same diagnosis will be counted only once. "Admissions", on the other hand, count all inpatient admissions, even when the same person is admitted repeatedly for the same diagnosis.
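The distinction can be illustrated with a few hypothetical admission records. The patient identifiers and record layout below are invented for the example and do not reflect the actual structure of the Hospital Admissions file.

```python
from collections import Counter

# Hypothetical admission records: (patient_id, ICD-9 diagnosis code).
admissions = [
    ("p1", "493.90"),
    ("p1", "493.20"),  # same patient, same 3-digit code (493)
    ("p1", "250.00"),  # same patient, different diagnosis
    ("p2", "493.00"),
]


def three_digit(code):
    """Reduce an ICD-9 code to its 3-digit category."""
    return code.split(".")[0]


def count_admissions(records):
    """Count every inpatient admission per 3-digit diagnosis."""
    return Counter(three_digit(code) for _, code in records)


def count_persons(records):
    """Count each person once per 3-digit diagnosis."""
    unique_pairs = {(pid, three_digit(code)) for pid, code in records}
    return Counter(dx for _, dx in unique_pairs)


print(count_admissions(admissions))
print(count_persons(admissions))
```

In this example, code 493 has three admissions but only two persons, while patient p1 is still counted separately under code 250.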

What are ICD-9 Codes? How can I find data on hospitalizations for asthma or drug abuse?
The 9th Revision of the International Classification of Diseases (ICD-9) classifies diseases and injuries into major groupings, or Chapters. There are 17 Chapters plus the V Codes, supplementary classifications relating to health status and use of health services. The ICD-9 Chapters are further subdivided into Sections, which are groupings of diseases identified by three-digit codes. There are 121 Sections and 999 three-digit codes, plus 82 two-digit V-codes. As an example, the first Chapter, Infectious and Parasitic Diseases, is divided into 15 Sections including Intestinal Infectious Diseases (Codes 001-009), Tuberculosis (010-018), Zoonotic bacterial diseases (020-027), etc.
For Drug Abuse, look under the ICD-9 Section for "Neurotic and other Nonpsychotic Mental Disorders (300-316)". Asthma and other pulmonary diseases can be found in "Chronic Obstructive Pulmonary Disease and Allied Conditions (490-496)".

Why is the Birth and Death data available only for multi-year periods after 1998?
The New York City Department of Health and Mental Hygiene maintains the vital statistics records for New York City, while the New York State Department of Health keeps these records for the rest of the state. (New York is the only state with such an arrangement.) For the first time in many years, the New York City Department of Health and Mental Hygiene is making vital statistics data available for small local areas (census tracts and zip codes). Since the advent of the internet in the late 1990s, the Department has had great concern about threats to the confidentiality of the data. It has now worked out an approach that allows this data to be made publicly available, though in certain limited formats. The Department is making the zip code data available in aggregated three-year groups, and the census tract data in aggregated five-year groups. While this limits the usefulness of the data somewhat, its availability is most welcome. It will help students, researchers, and advocates assess local conditions that are revealed by this newly available data.

Totals and Averages
Infoshare provides both three- and five-year totals, as the Department supplies them, and three- and five-year averages (obtained simply by dividing the totals by three or five, respectively, and rounding). This should make it easier for many users to work with the data.

Large-area Data
Infoshare uses the five-year groups of census tract data, with our geographic overlap methodology, to generate values for larger geographic areas (community districts, NYC neighborhoods, legislative districts). We have also generated values for those larger areas using the three-year groups of zip code data, even though the overlap methodology is not as accurate when using the much larger zip codes to generate data for these other geographic areas. However, we felt that the three-year groups for these data might make them more useful in some cases.

Data Not Available for Five-year Period 2008-2012
The five-year data for 2008-2012 is not available. This data is based on census tract counts, and the definitions of census tracts were changed during that period as a result of the 2010 census.

Data Suppression
To assure confidentiality of the data, the Department has suppressed cells containing fewer than five cases, including zero cases, over the three- or five-year periods. In order to be able to display the data, and to use it to generate values for larger geographic areas, we have inserted the value of 2 into those suppressed cells.

Do you have maps of the geographic areas in Infoshare?
Yes. We will shortly give Infoshare subscribers the ability to create maps directly in Infoshare Online, along with boundary files for all the areas in Infoshare. These files will allow users to prepare their own maps in MapInfo, AtlasGIS, Maptitude, and ArcView.
