Using Geographic Information System (GIS) to Analyze MCH EPI Data

MCHB/EPI Miami Training — December 5 - 6, 2005

Introduction to GIS & Exploratory Spatial Data Analysis (ESDA) — Transcript

 

RUSSELL KIRBY: I'm Russell Kirby, and Dr. Sharma and I are going to be your lead trainers, but we also have two additional trainers who are going to be around. So we'll have four people for hands-on as we go through the day. Just wanted to tell you a little bit about this really cool folder that you have here. It has some useful information in it. Among other things, it has a list of all of you, unless there's some new people. So you can know how to get in touch with each other afterwards. There's also a one-page bibliography which we provided that you picked up at the table that gives some background references. We've also included a copy of the ARC GIS primer that we received from the Oregon health department which I think was emailed to everyone ahead of the session.

And then we've included some of the presentations, the ones that we had ready ahead of time, are included here. And then we've included some articles that are more general background information that might be of use as well. Also wanted to let you know, I brought some manuals. I didn't know where to put them. So I put them up here in front. But I brought some of the ESRI manuals so if people need to refer to things today or tomorrow, they're up here in front. I only have one copy of each of the three books I brought. Then I brought some textbooks as well, if anyone wants to look at those as we go through the day. So what I wanted to do to get us started is — Ravi, can you make it so the PowerPoint comes up on the screen. It should be on the tool bar at the bottom. If you bring up the bar at the bottom. I had it loaded already.

What I thought would be good to get started is for us to introduce ourselves to everyone because I know many of you but not everyone we all know from each other from the different states. I thought one of the reasons we might do this is if as you introduce yourselves if you would briefly answer these questions it will also help us, because it's possible that we have some attendees who have different incoming skill levels and we particularly want to find out if there's people who have extremely high or lower level so that we can pair people up so that the training is as effective as possible. And I'm going to start, if I could, and I'll tell you who I am and my place of employment and all this stuff.

I'm Russell Kirby, professor of Maternal Child Health at the University of Alabama at Birmingham, and I happen to have a doctorate in geography. I earned it a long time ago. Almost in the prehistoric era. In other words, before there was anything called GIS.

But I've kept up with the field ever since. So in terms of how I currently use or plan to use GIS, I actually see Geographic Information Systems as one of the essential tools to do epidemiology. I think it's one of the core tools that we need to have, particularly if we're doing population-based data analysis. And I could spend a whole day seminar talking about why that is so and how that might work. But I think it's one of the essential tools. And I'm very delighted that we're having the chance to do this training and increase the level of skill and knowledge about GIS for MCH epidemiologists.

Now in terms of experience with ARC View and other GIS software I've actually been using different versions of ARC for a long time. I have not as much experience actually with the current version, although the skills tend to translate from one version to the next. But I've used ARC Info back when it was a mainframe application and five or six different versions of ARC View when it was on a PC and a little bit with the current version.

And one thing I plan to do in my spare time while in Miami Beach, I am hoping I can find at least one person who will go out on the beach and play Frisbee with me. I brought a Frisbee along to the meeting. So with that, I think what I'll do is I'll let Ravi introduce himself and then we'll start here and then just go through the room and we'll pick up Dianne and Z as we get to them.

After you introduce yourself, I'll pass the microphone around for everyone else.

RAVI SHARMA: So what you all have to do for me is stand up. Put your hands together like this. Traditional Indian greeting. Namaste. The divine in me greets the divine in each one of you. Thank you.

I'm Ravi Sharma, from the University of Pittsburgh , from the Graduate School of Public Health, in the Department of Behavior and Community Health Sciences. I'm actually a demographer by background. But all of my work has been with maternal and child health, specifically looking at the term and preterm and low birth weight and one of the things those of you who are geographically challenged, one of the things I'm going to be using a lot today is a lot of information and geographic data from Allegheny County and from Pennsylvania. So those who don't know enough about Allegheny, by the way I'm not being paid for by the Pittsburgh Visitors Bureau, this is on my own. But I hope you will get to learn a lot about Pennsylvania and Allegheny County , in particular, which is the county I live in. And Pittsburgh the city I live in. I live right in the city.

And I teach two courses. One of the courses, which is going on now, I just have to leave in the middle, is called Application of GIS and Spatial Data Analysis and Public Health. My students are mostly Ph.D. students in epidemiology and biostatistics and students in behavior and community health sciences. So the course is going on and fortunately it's towards the end of the term so the students are writing their papers and their presentations, and I can sneak away. It's 20 degrees in Pittsburgh.

This is a great place to be. You land here, it's 70 degrees and it's still the same country. It's wonderful.

Let's see, am I missing anything? So the place of employment I did. Currently planning to use GIS. I work very closely with my, with the public health department, called the Allegheny Public Health Department, with the director. And I've done a lot of work, if you recall Healthy Start, back in the '90s, I think it's still around in some way, shape or form. I did comprehensive maternal and child health assessment for them to get the program funded. And after that I faded into the background. But I've continued to do, you know, all the assessments for them. And my latest assignment has been as principal investigator for a behavior risk factor assessment of Allegheny County. And as you know there are very few counties that have actually done a behavior risk assessment. Most of these, the CDC type, are at the state level. I think there are a few counties that have done it. And we just also got another grant at University of Pittsburgh. It's a CDC grant. It's the Academic Excellence Center for Environmental Health Tracking, something like that. I'm the GIS, I provide the GIS expertise on that grant.

So I work very closely with my public health department. We have two groups. One is we have as you know the Rand Corporation has a satellite in Pittsburgh . So I work on their perinatal group. And Allegheny Health Department has a perinatal group and I'm on their perinatal group and we meet often to consider different issues and research. So when I get back on plant, we actually have a perinatal group meeting.

This is what the perinatal group meeting is, a multidisciplinary group meeting. It has faculties from nursing, public health, (inaudible) Women's Hospital, which is a huge hospital, and we all get together once or twice a month to work on different issues. I also work abroad, by the way, in women care programs. I've worked in Russia. Developed the Women's Wellness Center in Donetsk. It's a coal mining region of Ukraine. Done the same work in Albania and in a place called Sakhalin Island. Do you know where that is? It used to belong to the Japanese, part of Japan, and you know the spoils of war, the Russians got it so now it's part of Russia. It's far east. It takes almost a day to reach there. So I have done work there, too. Again in the MCH area. I think that's about it for me.

UNKNOWN SPEAKER: So we'll start with Tree and speak into the microphone.

Good morning, everybody. My name is (inaudible). I live in the Louisiana (inaudible) by (inaudible) but my house, okay, you know I have some (inaudible) but no flood inside. I'm lucky. So I'm assistant professor (inaudible) research in Louisiana State University School of Medicine (inaudible) Maternal Child Health Program. I've been using GIS software ARC View version 3.2 a long time ago, like three years ago. Now I'm using (inaudible) infant mortality by (inaudible) in Louisiana . So I know (inaudible) but I don't know most because I use a long time ago for ARC View. But it's not for — it's familiar with me for (inaudible). After break time, after time for the training I want to go through the beach with somebody else to enjoy the beach. Thank you.

My name is Evelyn Torres. I'm the SSDI coordinator from MCH Division Puerto Rico Department of Health. And my English is too bad. I'm sorry. And I never use GIS version 9. I use (inaudible) GIS in 1994 and ARC View 3.2 in my masters dissertation. And in my — I will plan to go to the shopping.


MARILYN KENNEDY: Hi my name is Marilyn Kennedy. I'm with the Minnesota Department of Health in Maternal and Child Health and Community and Family Health. My biggest experience with this type of geographic analysis has been with geo coding and that was with an earlier version, working with mental illness and mental health. Currently I'm working with perinatal issues and poverty issues in Minnesota where we've been 98% white for about a trillion years. We're now getting a huge, huge influx of all kinds of different minorities and people of color. So we would like to use GIS in terms of mapping those needs and coming up to speed and doing justice to our newly arriving populations of color.

And also my background has been in sociology and social welfare. Spent many years as a clinical social worker, and then moved on to public health after I got my Ph.D.

One of the things I want to do here is go for lots of walks on the beach.

LIONEL WHITE: Hello. My name is Lionel White. I work for the Massachusetts Department of Public Health, Bureau of Family and Community Health. I use the program, well, my experience, I've used two versions of Map Info Professional, 6.5 and 7.0. But that was maybe two years ago, and for the last year I've been using ARC View 9. So I have a good amount of experience. And I use it to map demographic population data. And I also plan to go to the beach hopefully maybe twice since I'm here.

LLOYD MULLER: Hi. My name is Lloyd Muller, I'm an epidemiologist with the Connecticut State Health Department. And I coordinate analysis of birth/death and hospitalization data there. I've had some use of ARC View but it's one of the tools in a series of tools that we use for conducting surveillance. But typical things we do with it are creating thematic maps which are largely descriptive but we also create thematic maps that summarize analytic results, whether or not they're significant changes or differences than, say, Healthy People 2000 reference numbers. I'm glad there's going to be some discussion of spatial filtering here. That's one of the things that SAT Scan does and other people have other techniques for doing. But it's I think a really interesting thing that begins to look at clusters across political boundaries. And I guess that's not on the list. So I'm not supposed to say that. But it is one of the things I'm looking forward to. Also I'm looking forward to swimming in the ocean. That's something that will happen. Hopefully. That's good.

CAROL STONE: Hi. I'm Carol Stone. I'm from the Connecticut Department of Health, division of family health. I've been currently using this, in fact I brought with me some files that I hope nobody minds I uploaded to my computer here so I thought I could incorporate some of the things I learn here today on a current project I'm working on that happened to be with school-based health centers and identifying some needs within the towns there. And one of my interests, just taking a lovely stroll down the road and feeling the sun on my face.

UNKNOWN SPEAKER: Good morning, everyone. My name is Leijo. I'm the director of the health director services from the Mississippi Health Department. One of the big areas that we (inaudible) support and data analysis at Maternal Child Health. For the past two years I've been using the GIS 8.3. And we have some experience — we had a project we called the Closing the Gap between infant mortality, was one of the first states in the country that got that grant. So we right now the health department tried to collaborate with another bureau, which we call the Vital Statistics. We have GIS teams. So hopefully we can work together on some Maternal Child Health issues. And in my spare time I try to find good Chinese restaurant and see if more authentic than Mississippi restaurants. Thank you.


JANE MEYER: I'm Jane Meyer. I'm with the State of Texas Department of Health Services. I have eight or nine years of experience with ARC View but with the very early version just yet. So I have experience. We've done annual analyses of demographic and MCH data for the Title V grants in our area as well as I'm sure all of you are connected with doing the needs assessments. And for something I plan to do in my spare time I'm looking for Miami architecture. So anybody who has got a Miami tour lined up, let me know.

CHRIS WALDRON: Hello. I'm Chris Waldron with the Indiana State Department of Health. I've been using ARC GIS software for the past eight years or so. We do a lot of geo coding at the state and I guess I'm here to find out exactly new ways or old ways to analyze our data. And I guess I want to see a sunrise over the ocean.

JACK KENNEDY: I'm Jack Kennedy. I'm with the Division of Specialized Care for Children in Springfield , Illinois . And we've been using ARC View to map our providers that serve the client, our children and our care coordinators have been looking at that data and sort of seeing where they are so they can get our children to those providers that are closer to them. We've also been mapping our clinics in the area and taking a look at that and using the data as to the individuals that come to the clinics and doing some analysis with that.

I've been using ARC View 9 probably for about a year now, and every time I think I know the program, then I find out that I don't. And if I'm away from it for a little bit then I have to go through a learning process again to figure out what I forgot.

What I'm going to do here is enjoy the sunshine. It was 25 degrees when I left Springfield, and I tell you, this is great.

CAROL MOORE: I'm Carol Moore from Kansas , from Kansas Department of Health And Environment, the Bureau of Children Youth and Families. We currently use ARC View, I've been using ARC View 9.1 and trying to teach other people how to use it for a while now. Before then we had 3. I don't know how we jumped from 3 to 9, but we did. And we use it — I use it mainly to just show people, okay, you show them some stats if you visualize it with a map it's helpful for a variety of different people from fellow employees to state organizations, to better understand the data.

We've used it at a very simple level, looking by county. Anxious to learn other ways to use GIS. In Miami what I'll mainly be doing is working on a PRAMS application, but occasionally hoping to get a walk in on the beach.

SUSAN ELDER: Good morning. I'm Susan Elder, I'm from the State of New Mexico . GIS is used at a descriptive level, pretty extensively in our state in the vital records group, for example, where they show a variety of birth outcomes by county and so forth. Currently I'm hoping to use GIS to do such things as map PRAMS data by county superimposed on the major employers, because we're looking at workplace policies, for example, and breast feeding continuation.

Some other uses are smoking mapped against the smoke-free cities of New Mexico to look at smoking amongst women and so forth. And the first project actually that I'll do when I go back is to just do a descriptive pinpoint mapping of WIC clients in two of the big counties in New Mexico that are sparsely populated so that the WIC program can best determine where to put their mobile sites, to better serve the population. So I think that will be good clean fun.

My experience, I am very high end user of the National Geographic Atlas of maps with a magnifying glass and I'm a somewhat experienced user of such things as EPI Map and so I'm just tickled pink to now get a little more sophisticated and hope I can keep up.

I'm looking for anyone who would like to get in a taxi with me and go find Little Haiti. I did some work in Haiti years ago and I want to go eat Haitian food and find some music. If you've never been exposed, you cannot stay in your chair once it gets going. So that's what I want to do. This works good.

UNKNOWN SPEAKER: Hello. Is this on?

UNKNOWN SPEAKER: Is this working?

UNKNOWN SPEAKER: I'm Cheryl (inaudible) I'm from — I work with the infant (inaudible) unit. I've been working with GIS for about five years but certainly not an expert because I go in and out like the gentleman said over here. The last project I did was for a county next to Leon County where Tallahassee is located. It was mapping clients, state clients to determine the route of a bus route that they want to design. So I'm more functional. I'm probably the only one in our division doing it right now. If some GIS projects come through they come to me through training but I wouldn't consider myself being the expert. I live in Florida. The beach, been there, done that. Plus with the hurricanes I don't like water. I probably have two papers to write because I'm in the DrPH program and I probably will go shopping with my friends when they get here on Wednesday.

ROB SATTERFIELD: I'm Rob Satterfield, with the Utah Department of Health. Also the epidemiologist for CSHCN in our state. Currently we're mapping our ADDM, autism data, with environmental data from our state. My experience level is I've had several ESRI classes but not a lot of practical experience. And I'm also a PADI scuba instructor so I'm hoping to get wet at some point.

UNKNOWN SPEAKER: My name is (inaudible) and I work with the North Carolina Division of Public Health, in the nutrition services branch. I usually work with WIC population data and have been in North Carolina for about ten years. During this time we have been using county-specific data for mapping the county level. But we get a lot of help from Diane Wright who is here because they have a very good GIS lab. Until now we didn't feel the need of putting GIS at our unit, but we plan to do it because we are expanding our surveillance system. And the surveillance system right now we have already developed the database and we are collecting the data from the WIC population in children's and youth program. Child obesity is a big problem in North Carolina . So we'd like to map the child obesity by zip code level and maybe find out, you know, the problems that we are having.

What else? I've used the SAS program to map some data but not the zip code or (inaudible) level. So I guess I'll be able to learn something new here today. And at least one thing, actually I like — I love eating. So I'd like to find some good restaurant around here. I walked about 15 minutes yesterday and couldn't find any. So maybe that would be my adventure.

DEREK CHAPMAN: I'm Derek Chapman, Assistant Professor of Epidemiology and Community Health at Virginia Commonwealth University, and thanks to SSDI funding actually spend most of my time at the Virginia Department of Health as their MCH epidemiologist. We've had a lot of — we have a lot of staff that started using GIS at the health department doing basic county level maps, replacing ugly tables and reports. But they and I have no experience with any spatial data analyses or smoothing of rates or any of the autocorrelation or anything like that. So I'd like to move us forward from simple mapping to doing some actual analysis and looking at our birth defects, low birth weight, et cetera. So I've been using ARC View for about three years, about whenever 8.1 came out I started using it. And I also will be working on a PRAMS application in my free time.

ANGIE CARLSONBURG: Hi, I'm Angie Carlsonburg. Work at the Wyoming Department of Health in Community and Family Health, and I don't use GIS. I used EPI Info to map a few things but I'm using a GIS tutorial at work and trying to work through it. And so I don't have very much experience, but I hope to get a lot of great ideas here today. And the one thing I'd like to do here is go to the beach every day.

AMANDA KATIE YARBORROW: My name is Amanda Katie Yarborrow with the local health department in Madison , Wisconsin . I've used GIS to make maps but I'm hoping to kind of learn how to use it to analyze data instead of just making pretty maps. And I hope to get down to the beach while I'm here.

UNKNOWN SPEAKER: Good morning. My name is Qauni Borrow (ph). I'm with the State of Tennessee Maternal Child Health (inaudible) with GIS. We don't have (inaudible) where they have one person here so I'm here to learn GIS and be able to help in analyzing maternal and child health data (inaudible) fatality, to come up with analysis. So far I think I'll be here and be, it's better to learn as much as I can about my state and health, the people who are there and hopefully I hope that I'll be able to work (inaudible) and also go in the water.

DIANE ANN WRIGHT: I'm Diane Ann Wright. I work for the state of North Carolina Division of Public Health, State Center for Health Statistics, and in the State Center for Health Statistics we have the Cancer Registry, Birth Defects Monitoring Program and Vital Statistics and health services. I manage the health and spatial analysis unit. I've worked with GIS for many years, over 11 years. I've been with the state of North Carolina for 11 years, been managing the unit for the last three years. Like Naj said, we also provide a lot of services to women's and children's health and also I do a lot of work with communicable diseases, one of my biggest projects is syphilis elimination. We also do a lot of things with WIC and infant mortality and Healthy Start and SIDS.

And I think I already did what I wanted to do in Miami was go for a walk on the beach. Got to do that last night. So I'll pass it on.

UNKNOWN SPEAKER: I'm (inaudible) from Iowa Department of Public Health. I have used map for years. When I tried to switch aptitude to ARC View I found that my (inaudible) didn't go with me over time. So I found easy solution. I went to University of Iowa, found the graduate student and contracted with them. I told them instead of buying the software update every year I just spend this money for graduate student and support them. They were happy. I'm happy, too. So I have been contracting with the university for years to finish my mapping needs. I'm not the only one who likes to eat so I want to eat all kinds of food here. Thanks.

UNKNOWN SPEAKER: Michael (inaudible) with the Michigan Department of Public Health, epidemiologist that works with children's special healthcare services program and the oral healthcare programs, currently trying to use GIS to help build a medical home model in Michigan. One of the problems with that is all I have is EPI Map so we're trying to clamor for more software if the powers that be let us. We've been pushing now for about a year and a half. So hopefully some of this training will help get that off the ground. I already did the one thing I wanted to do in Miami and that was I ate at Emeril's restaurant last night in South Beach. I didn't see Ricky Martin or anything. But that's what I have.

KATHY WASSERMAN: I'm Kathy Wasserman, I'm from Washington State . I've used GIS. We have a GIS office in the Department of Health. So I've worked with our technical staff on doing county level maps and have also worked on a project that I'm hoping to do more work on which is looking at access to obstetric care in Washington and distance to providers.

So I have very little actual hands-on experience with any GIS program but have done a lot of consulting thinking work behind it. And one thing I want to do while I'm in Miami is get outside and see the sun and get lots of Vitamin D because we don't have very much of that in the northwest.

POLLY PATEL: Hello my name is Polly Patel and I work with the Michigan Department of Community Health. As Mike said before, we don't really have any ARC View. We only have EPI Map which is what I use to map birth defects. I also work with PRAMS. And one thing I like to do is go out and eat at lots of restaurants.

BRIAN WOODS: Good morning. My name is Brian Woods, New Mexico. I'm an epidemiologist with the Division of Epidemiology and Response with the Substance Abuse Unit. I'm here because I have an opportunity to work with the state's environmental public health tracking system. We have enormous disparities and enormous, or I should say super, sources of cesium, plutonium, arsenic, we have high concentrations, low birth weights. We're hoping to be able to use the GIS and the information from our vital records and statistics, PRAMS, other sources, to try to begin to map out, identify and perhaps counter-measure some of these unfortunate outcomes. We also have super disparities. We have probably the richest county in the country and the poorest state in the country.

UNKNOWN SPEAKER: Good morning everyone. I'm Brian (inaudible) I'm from the Philadelphia Department of Public Health. We use GIS in a descriptive way, to understand distributions of patterns of disease that we've used it recently to figure out what the burden has been as we've gone from 19 delivery hospitals in Philadelphia down to nine in the past eight years, to see how that has impacted us. But I'd like to get more to an analytic level with GIS. One thing I plan to do while I'm here is hopefully watch the Eagles beat the Seahawks somewhere.

MIKE CURTIS: Hi. My name is Mike Curtis from the State of California Maternal Child Health and Adolescent branch. We've historically used Map Info and we've switched to ARC View in the last year. We've used this mapping data for a lot of descriptive work at the county and census tract level. We've done a little bit of analytical work looking at teen birth weight hot spots at the census we asked to prepare the data is analyze it to throw it back then MapInfo now ARC View. But I feel that we've used it in just a very basic descriptive way to show county level differences. And I think we've had some interesting discussions about how to analyze the data statistically and what's the best way to show the data. Do you want to show statistical significance below the state rate or above the state rate or do you want to show quartiles? It really changes the distribution of the maps. And a lot of times we're struggling with what's the truth, what do we really want to show, what's the best way to show it. So in addition to getting to know the expanded capabilities of ARC View and being able to analyze the data, I know it's a very complex field. I'm hoping to maybe get a little bit of information about some more intermediate level decision making for how best to present data, even if it is descriptive. And I have a cousin here I don't get to see very much and I'm hoping I'll be able to go out to eat and see a little bit of Miami. I've never been here before.
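[Editor's illustration] The choice Mike Curtis describes, quartiles versus significance relative to the state rate, can be seen in a small sketch. Everything here is an assumption made for illustration: the county names, the counts, the state rate, the rank-based quartile rule (ties are not handled), and the normal-approximation 95% interval are not taken from the training materials.

```python
import math

STATE_RATE = 0.08  # hypothetical statewide low-birth-weight proportion

# Hypothetical county data: (low-birth-weight births, total births)
counties = {"A": (120, 1000), "B": (9, 100), "C": (50, 1000), "D": (7, 100)}

def quartile_classes(rates):
    """Rank-based quartiles: 1 = lowest quarter, 4 = highest quarter."""
    ordered = sorted(rates, key=rates.get)  # county names, lowest rate first
    n = len(ordered)
    return {name: 1 + (rank * 4) // n for rank, name in enumerate(ordered)}

def versus_state(events, births, state_rate, z=1.96):
    """Label a county by whether its approximate 95% CI excludes the state rate."""
    p = events / births
    half_width = z * math.sqrt(p * (1 - p) / births)
    if p - half_width > state_rate:
        return "above"
    if p + half_width < state_rate:
        return "below"
    return "not different"

rates = {name: e / b for name, (e, b) in counties.items()}
by_quartile = quartile_classes(rates)
by_significance = {
    name: versus_state(e, b, STATE_RATE) for name, (e, b) in counties.items()
}

for name in sorted(counties):
    print(name, round(rates[name], 3), by_quartile[name], by_significance[name])
```

In this made-up data the two small counties land in different quartile classes even though neither can be distinguished from the state rate, which is one way the two kinds of map can tell different stories.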

CINDY CHAMBERS: My name is Cindy Chambers. I also work at the California State Health Department MCH branch with Mike. I worked with ARC View just this past year we were using MapInfo earlier. So we are excited to change to ARC View because the rest of the state is using it. So we're on the same page now. I've been one of the few people in our branch that's doing the mapping. So I was looking at breast feeding and actually recently some poverty data by census tract and by county. I'm really excited to learn to do some things at a higher level here. And I'm excited to run on the beach.

UNKNOWN SPEAKER: Good morning, everyone. My name is Angela (inaudible) and I'm with the National Association of County and City Health Officials, NACCHO, known to some as nacho. But NACCHO. I'm here to learn more about GIS and how local health departments can use this in their work or are currently using it in their work. And I have a checklist of things while I'm here in Miami, including walking and running on the beach. Shopping. Somewhere down South Beach. People watching on Ocean Drive and celebrity watching on Ocean Drive. I think I saw Ashton Kutcher. So it's like I saw somebody. Watch the sunrise, I had the chance to do that when I got up at 6:00 this morning, and going to eat Cuban food at Larios on the Beach on Ocean Drive. It's a good place, to me it's good because I don't know what real Cuban food tastes like, if anyone knows what the real stuff tastes like they might have a different opinion, but thanks.

UNKNOWN SPEAKER: I'm (inaudible) I'm also with the Maternal and Child Health Project at the National Association of County and City Health Officials. I'm somewhat familiar with ARC View GIS. I was trained on it in graduate school, but I haven't used it to the extent you all have. So I'm looking forward to learning more about it today. And I'm from Miami , so there's nothing specific that I'd like to do. But it's 40 degrees in DC, so just looking forward to being outside in the sun. That's it.

UNKNOWN SPEAKER: My name is (inaudible) since I'm not a state official, I represent no state. I'll pass it on.


UNKNOWN SPEAKER: I've been introduced too. But in case I forgot to mention, in case anybody wants to meet with me, all meetings will be held on the beach. Thanks.


KYLE GARNER: Good morning. My name is Kyle Garner with the Department of Human Services in Springfield , Illinois , and I've been using MapInfo for the past two years. Consider myself kind of a beginner user. Used it to do some descriptive work with our vital records and our WIC and family case management data. And one thing I'd like to do, I'd like to relax a little bit. I've got three kids under three at home.

LACOTA CRUISE: Good morning, my name is Lacota Cruise. I work with the New Jersey Department of Health in senior services. I work with our Maternal Child Health department epidemiology. We've used ARC View to look at prenatal care, utilization, lead screening and we have plans to exploit it further to look at issues of environmental exposures and preterm birth and asthma. While I'm here I look forward to going for a swim in the ocean.

UNKNOWN SPEAKER: I think that marks the end of the introduction.

Now I'll turn it over to Russell Kirby and to Ravi. Thank you.

RUSSELL KIRBY: As you can see, we have a great diversity. I didn't count the number of states, but it's somewhere over 20 states that we have represented. So we're certainly going to cover a large part of the country. One thing I did want to tell you that we're not going to do we're not going to talk about geo coding in this workshop. So if anybody came here for that, write it on your evaluation that you want to have a separate training on geo coding, which actually is a fairly complicated topic. But we decided that we would make the assumption that we're working with data that's already geo coded and do things with it from there. And Ravi is going to momentarily have things up and we'll get started.

On your desktops, there is a folder called GIS-MCH. And in that folder you will find most of the materials that you'll need for the class and there should also be — I think I might let Ravi kind of get us into that, if you'd like to.

RAVI SHARMA: Thank you. If you go on to your desktop you'll see a folder. I'll use my cursor here and point to it. It's called GIS-MCH. Do you see it? That has all the data that we're going to be using today and tomorrow. And it includes data for ArcMap and for a program called GeoDa, an exploratory spatial data analysis program from Illinois. Luc Anselin is the architect of that. And we're going to use another program called SaTScan. Martin Kulldorff is the author of that. We have point data for that. And we are also going to be showing you something else. Let me ask you: Some of you have been talking about measuring poverty level by census tract and so on. When you download the data set, how do you do it? Do you use the American FactFinder primarily? How else do you do it? I see one hand that says they use the American FactFinder.

UNKNOWN SPEAKER: We also have — (inaudible) which deals with the same kind of things that (inaudible).

RAVI SHARMA: Yes. So what we're going to do here today and this afternoon is show you how to use FTP to download, then process and link the census data to ArcMap, and then use ArcMap to symbolize and analyze your data.

And with ArcMap, with FTP, it works with Microsoft Access, so you have access to tons of information. Once you've downloaded the data set, you then have access to all the variables. And it's much easier then to process the data set, and then you can use ArcGIS actually to calculate your rates, to do anything you want. ArcGIS has a scripting language called Visual Basic script. We'll show you a little bit of it. Again, before we start showing you this we might get sidetracked. So we'll go slowly.

So we're just going to get started. We're just going to try to pull my — we're switching computers here. So it takes a few minutes sometimes to get these things to register. Here we go.

So Russ and I are going to talk for a few minutes generally about spatial analysis, and this is very interactive. So please feel free to stop and chat. We're more than happy to do so. So this is a very interactive format. So please feel free to stop me in the middle, just raise your hand or speak up.

So when people use the term GIS, they also use the term spatial analysis. Now, to some extent they do overlap. I specifically use spatial analysis as my term. So there are various methods of looking at geographic patterns in your data and the relationships between features. By feature, I simply mean a geographic feature: it could be a road, it could be an address, it could be a river. So when I use the word feature, just think of that as a geographic feature. And then there are the relationships between these features. The actual method that you use could be a map or something more complex; for example, you can create a spatial data model that combines multiple data sets. And that's the great thing about GIS: not only can you bring in original data, but you can use the data that you have in your GIS database, combine it and create new data sets.

And so we are going to look at some of this in a few minutes. Now, I think this is really important when you get started on any GIS/spatial analysis project. The first thing I think all of you need to do is formulate.

UNKNOWN SPEAKER: I didn't find your PowerPoint in the notebook.

RAVI SHARMA: It should be on the folder. If you go to the folder.

UNKNOWN SPEAKER: It's on the notebook but I didn't find it in there.

RAVI SHARMA: Probably wasn't in there. I think that's one of the late additions. We probably are going to make some copies. Did you find it in your ArcMap folder? In the folder on your PC.

UNKNOWN SPEAKER: Yes.

RAVI SHARMA: Yes, we'll give you a hard copy, too.

Okay. So the first thing: Formulate your question. In other words, there's a tendency, very prevalent in public health research, and especially if you're an epidemiologist you probably know it, that you start analyzing your data and then you try to figure out, well, what does this mean? In other words, they're data rich and theory poor. All of you who have gone through any course in philosophy of social science or social science research methods know that you really need to start with a theory, some rudimentary idea. For example, the gentleman from New Mexico is talking about environmental pollutants causing certain health outcomes. That is a good theory to start with. And then you ask yourself: is there any literature that says that particular kinds of environmental pollutants or contaminants lead to some outcomes? It's important to know that one can certainly do what we call exploratory analysis, in which you're just fishing, but it's always good to start with some theory.

So formulate your question, very important. And then the second is once you formulate your question, then you ask yourself: What kind of data do you need to answer your question?

That leads to selection of a method or a combination of methods. Then we go to data processing and then we go to displaying your results, which GIS can give you wonderful capability.

So let's look first at formulating your questions. Start your analysis by being very, very specific. Ask yourself what it is that you need to know. These can be any kinds of questions. I've just put some on the screen here. But from your own experience, I'm sure, as you do work in your departments and in your state, you have very specific questions. The ones that I have are: How many women of childbearing age live within a two-mile radius of a TRI site? Is there a childhood leukemia cluster in my county? What neighborhoods in my county have a significantly higher death rate from breast cancer? What do you know about levels of exposure in relationship to distance? Really, that's becoming more precise. Do you know what a TRI site is? Toxic registry. Maybe, I forget your name, from New Mexico, do you all know what it is, toxic registry?


UNKNOWN SPEAKER: Are those possibly the Superfund sites?

RAVI SHARMA: Superfund but it's the sites that the EPA monitors regularly for emissions. These are published on the EPA website which you can go and actually on the EPA website you have the XY coordinates of every TRI site. This is based on the Right to Know Act that was published years ago, and you know then, depending upon which government is in power we sometimes do not get, because these are industries that do emit pollutants. You have those criteria, what we call the criteria pollutants. And these are supposedly monitored on a regular basis, and information is put on a website. They have X and Y coordinates.

UNKNOWN SPEAKER: Are you saying TRI?

RAVI SHARMA: Toxic registry.

UNKNOWN SPEAKER: Toxic.

RAVI SHARMA: Toxic Release Inventory. TRI. Not too many of you have heard of that site. It's on the EPA website, and it's yearly. For example, I live in Allegheny County, and I'm going to show you some data. Despite the fact that the U.S. steel industry has pretty well gone out of existence, we still have a lot of industries that are listed as emitting pollutants, and they are supposed to report every year to the government how much. And that's put on the website. It actually has X and Y coordinates. That means the addresses have been geocoded. My recommendation, of course, is don't believe the X and Y coordinates that are on the website, because I have put the X and Y data on my map and found that some of those points that are supposed to be within Allegheny County tend to be miles away from Allegheny County. So obviously they're geocoded wrong. So if you're going to work with TRI data, always make sure that you do the geocoding yourself, just to make sure.
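
The check described here, plotting published coordinates and spotting points that land outside the county, can be approximated before any mapping is done. The following is a minimal sketch only: the facility names and coordinates are made up for illustration, and the bounding box is a rough, hand-picked rectangle, not an official county extent. A real check would test against the county polygon itself.

```python
def points_outside_bbox(points, bbox):
    """Return the points whose lon/lat fall outside a rectangular bounding box.

    bbox is (min_lon, min_lat, max_lon, max_lat) in decimal degrees.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    return [
        p for p in points
        if not (min_lon <= p["lon"] <= max_lon and min_lat <= p["lat"] <= max_lat)
    ]


# Assumption: a rough rectangle around the county, chosen for illustration only.
ROUGH_COUNTY_BBOX = (-80.4, 40.2, -79.7, 40.7)

# Hypothetical facility records; not real TRI data.
sites = [
    {"name": "Facility A", "lon": -79.95, "lat": 40.44},
    {"name": "Facility B", "lon": -78.10, "lat": 40.44},  # well east of the box
]

suspect_sites = points_outside_bbox(sites, ROUGH_COUNTY_BBOX)
print([s["name"] for s in suspect_sites])
```

Any record flagged this way is a candidate for re-geocoding from its street address rather than proof of an error, since a bounding box is only a coarse screen.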

I'm sorry about that. So if you see any of these other abbreviations like TRI and so on, and you don't know what I'm talking about, please stop me.

One of the problems I have is I'm in the School of Public Health, so everybody there knows what a TRI site is. So let's summarize: specificity of your question, analysis method, and then presentation.

But based on your question and the kind of hypothesis you might have, the next question is: what kind of data and features do you need? The type of feature and attribute data that you need is not always available. You may have a very specific question in mind for which there is really no data available, and you may have to create the data in some way. That always depends on the level of specificity with which you are asking the question. But you can sometimes create data, and we're going to give some examples as we go along that show how you would do that.

Selection of methods: The decision with respect to method will be guided by, one, what is available, and the depth of the data. One of the problems we have when we try to relate, for example, environmental exposure to adverse pregnancy outcomes is that we're limited to whatever data is available. And that's not necessarily good. So if we really want to do, for those of you who are epidemiologists, a case-control study, we need to go out and actually do the study, determining the source and the pollutant, and then finding the cases and then the controls. And you know how expensive a study like that could be.

So that's just one example of how the kind of design you might have will influence the kind of data you need.

And then there is the question of processing time and effort. For example, for the SaTScan exercise that we have here, I have 56,000 births, and low birth weight is around 11%. So it took me 26 hours to run that. 26 hours. Fortunately I have two computers, with one fully dedicated to just GIS work, but it's a Monte Carlo simulation. So it goes through 999 times, I think, and it took 26 hours to get the results. So Russ and I decided we will not use the individual point data. I aggregated the data to, you know, what you call the census tract centroid. So we'll do a much smaller exercise that you can do in about ten seconds. But I'm just giving an example of how long it takes to do a full simulation using SaTScan: 26 hours. And I have a pretty fast computer. It's the fastest I can get, 3.2 gigahertz. Nothing else is on it, and it has two gigabytes of memory. I probably may need a second processor in that. So the second question is precision of results. If you're statisticians or epidemiologists, you know that when we talk about precision of results we're talking about what kind of confidence you want in your result. You typically talk in terms of a 95% level of confidence and so on.
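
To see why 999 replications multiply the running time, it may help to look at how a Monte Carlo significance test is generally structured: the test statistic is recomputed on each simulated data set, and the p-value comes from the rank of the observed statistic among the simulated ones. The sketch below is a generic illustration with a toy statistic (the largest case count in any of ten equal-sized tracts); it is not SaTScan's scan statistic, and the function names and numbers are hypothetical.

```python
import random


def monte_carlo_p_value(observed, simulate, replications=999, seed=0):
    """Rank-based Monte Carlo p-value.

    simulate(rng) must return the test statistic for one data set generated
    under the null hypothesis. The cost is one full recomputation of the
    statistic per replication.
    """
    rng = random.Random(seed)
    as_extreme = sum(1 for _ in range(replications) if simulate(rng) >= observed)
    return (as_extreme + 1) / (replications + 1)


def max_tract_count(rng, n_cases=50, n_tracts=10):
    """Toy null model: drop cases into equal-sized tracts at random and
    return the largest count seen in any one tract."""
    counts = [0] * n_tracts
    for _ in range(n_cases):
        counts[rng.randrange(n_tracts)] += 1
    return max(counts)


# Hypothetical observation: 20 of the 50 cases fell in a single tract.
p = monte_carlo_p_value(observed=20, simulate=max_tract_count)
print(p)
```

With 999 replications the smallest p-value that can be reported is 1/1000, which is one common reason for choosing that number. Aggregating points to tract centroids, as described above, shrinks the work done inside each replication rather than the number of replications.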

So those kinds of questions still are very important in spatial analysis. On top of that, we have an additional problem of spatial error. Now, a good example of spatial error is when you geocode. The geocoding is not precise, and you may be off one mile, two miles. You may be off a few miles. And depending on the kind of study you're doing, the precision is also going to affect your results. For most kinds of work we do in public health we don't need that high a level of precision. So we can probably get away with a small amount of error in our data, but that's up to you to decide: how much error can you tolerate and still have a respectable study?

If you're looking for example at infant mortality you might want to map the mortality rates. On the other hand, if a typical industrial plant is being charged with causing a particular disease in your community, for example, you might need more precise and detailed data. This is just an example of where you would like to go depending upon the kind of questions you have and what kind of methods you might select.

Obviously this is where we're going to learn today and tomorrow all the different tools that GIS provides for implementing the selected methods.

You know there are different ways to look at public health data, and MCH data in particular. The data can be discrete. Discrete is: is it low birth weight or not, yes, no, zero, one. Or continuous. You can look at preterm as a continuous distribution. So you can have different ways. Or you can aggregate your data sets by polygons, which are areas. Most of the time I think most of you probably work with data that is aggregated to certain geographic areas like a census tract, a block, a county.

Now, the point data that you might have, like an address, that, of course, is the finest level of detail you can get in your data. So if you have point data, for example, you have vital statistics data. You work with the birth or death files, and you have the address of the mother on the birth certificate. And if you have a high level of confidence in that address, you can geocode it using ArcGIS, and you can display it on the map as X and Y coordinates. And that data set gives you a lot more information and a lot more ability to do different things with your data. You can use the data as is, simply as point data, or you can do cluster analysis. You can, of course, aggregate it to the census tract level and then relate that data set to other specific socioeconomic and demographic variables from the census. So there are different levels of MCH public health data and different things you can do with your data sets.

So just a little summary of what I was just saying. Discrete data: geographic features for which locations can be specified. The feature is either present or not present at any given spot. And a discrete object has known and definable boundaries. Now, this is not always true when you are working with environmental data. Sometimes it's not clear precisely where the object begins and where it ends. So an environmental pollutant, for example, is a continuous surface. So while it's easy to put these on the screen as being very clear, in practice, when you actually work with this data, the boundaries may not be very clear.

I hope you can see it on your screen. Continuous data or a continuous surface represents phenomena where each location on the surface is a measure of concentration level, or its relationship to a fixed point in space or to an emitting source. So, for example, most of the counties, especially metropolitan counties, have monitoring stations that monitor criteria pollutants like ozone, SO2 levels and so on. So if you have SO2, for example, or if you have ozone concentration measurements for all the monitoring stations in your county, each of those is a point location. And the level is a continuous level. And you can actually use that continuous measurement of point data to create a continuous map of ozone concentration in your county or surrounding counties. So that's what we mean by continuous. In other words, you take known measurements based on monitoring stations, and you use GIS to interpolate to unknown areas in your county. So you, therefore, have a continuous map.
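
The interpolation step described above, going from known station readings to an estimate at any location, can be illustrated with inverse distance weighting (IDW). This is a sketch under assumptions: the talk does not say which interpolation method is used, IDW is only one of several options a GIS typically offers, and the station coordinates and ozone values below are invented.

```python
import math


def idw(x, y, stations, power=2):
    """Inverse-distance-weighted estimate at (x, y).

    stations is a list of (station_x, station_y, value) tuples in a planar
    coordinate system. Nearer stations get larger weights (1 / distance**power).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0:
            return value  # exactly at a station: use its measurement
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical monitoring stations: (x_km, y_km, ozone_ppb)
stations = [(0.0, 0.0, 40.0), (10.0, 0.0, 60.0)]

print(idw(5.0, 0.0, stations))   # halfway between the two stations
print(idw(0.0, 0.0, stations))   # at the first station
```

Evaluating a function like this at the center of every cell of a grid is what turns a handful of point measurements into a continuous surface map.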

So let's see. Let's go to the next one up here. This is the most common kind of data that you probably work with. These are spatially aggregated features: summary data for various geographic levels, most commonly from the census tract all the way up. So you may have counts of (inaudible) children, low birth weight babies, childhood leukemia cases, et cetera, by tracts and counties.

Is your data discrete or is it continuous? So when you're representing and modeling public health features, as I pointed out before, the boundaries are not clearly continuous or clearly discrete. So you can conceptualize a continuum with purely discrete data on one end and a purely continuous feature at the other end. And most features probably will fall somewhere between the two extremes. So think of them as polar opposites and most of the (inaudible) features are going to be somewhere at one end of the spectrum.

The decisive factor of where a feature falls on the discrete-to-continuous spectrum is the ease of defining the feature boundaries. If you can define the boundaries of some of these features you deal with, it may be easier for you to define whether it's discrete or continuous.

Just very quickly, let me talk about two more models. Most of you are probably familiar with them from working in GIS. Actually, these also represent two different kinds of GIS, although nowadays most GIS have capabilities for working with both kinds of data. But typically GIS systems are based on one or the other of these two. One is the vector data model. The second is the raster data model. ArcGIS, which most of you work with, is a vector model. The raster model, I don't know if any of you work with a raster model, it's (inaudible) and (inaudible) is a raster model.

UNKNOWN SPEAKER: What is that?

RAVI SHARMA: IDRISI. Al-Idrisi, he was a traveler, a Moroccan traveler. It was named after him. It's developed and marketed by Clark University. Am I right?

UNKNOWN SPEAKER: That's right. There was also a program called GRASS.

RAVI SHARMA: Yes. Developed by the U.S. Corps of Engineers. That's still probably available. So if you, for example, work with satellite images, then a good GIS system for you to use is IDRISI or some other raster-based system. But as I pointed out, vector-based systems like ArcGIS also now have capabilities for working with raster data. So you're not at a loss even if you have a vector-based model like ArcGIS. You can still work with spatial data using the Spatial Analyst module that you can buy as an add-on to ArcGIS.

So Russ is going to talk a little bit about the raster and vector models.

RUSSELL KIRBY: And what this is, is the age-old problem of having a real world and wanting to translate it into the data that you actually have. Basically the vector approach uses points and lines and polygons to represent spatial attributes, and the raster approach converts the entire area into a grid, typically at a very small cell size. In fact, with remotely sensed data I understand it's possible to sense locations that are small enough that they would identify individual people, as small as one to five meter squares. I'm just going to go very briefly through this, just so you understand what raster is. Basically we divide the entire area into a network of grid squares, we have a different code for each of the grid squares, and then they're referenced in terms of their spatial coordinates at the corners, and that allows you to identify them. So you can collect a huge amount of data over a very small area and potentially not know very much, or potentially know a great deal about it, using the raster approach. The reason why I'm bringing this up is that most of our health data are never going to be collected in raster form. But it's very likely that you might be working with environmental data that have been collected in this fashion, because it's possible with remote sensing. For example, if you have a satellite that's going over an area that's had a toxic waste emission, you could potentially capture images of that particular exposure over time and be able to use that in your environmental monitoring and modeling, in terms of figuring out what the risk is at different locations.
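
The grid referencing described here can be sketched in a few lines: given a grid origin and a cell size, any coordinate maps to a row and column, and values (here, simple point counts) are stored per cell. This is an illustrative sketch with made-up coordinates and a hypothetical one-unit cell size, not the storage format of any particular GIS.

```python
from collections import Counter


def cell_index(x, y, origin_x, origin_y, cell_size):
    """Return the (row, col) of the square grid cell containing (x, y).

    The origin is the lower-left corner of the grid; rows count upward in y
    and columns count rightward in x.
    """
    col = int((x - origin_x) // cell_size)
    row = int((y - origin_y) // cell_size)
    return row, col


def rasterize_counts(points, origin_x, origin_y, cell_size):
    """Count how many points fall in each grid cell."""
    return Counter(
        cell_index(x, y, origin_x, origin_y, cell_size) for x, y in points
    )


# Hypothetical point locations in map units.
points = [(0.5, 0.5), (0.9, 0.1), (2.5, 1.5), (4.0, 4.0)]
grid = rasterize_counts(points, origin_x=0.0, origin_y=0.0, cell_size=1.0)
print(grid[(0, 0)], grid[(1, 2)], grid[(4, 4)])
```

Halving the cell size quadruples the number of cells covering the same area, which is why fine-resolution raster data sets get large quickly.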

So then with the vector approach, we basically store our features as sets of XY coordinate pairs. If it's a point, it's a single XY coordinate pair. If it's a line segment, it's a string of XY coordinates that define it. If it's a polygon, the first point would be referenced as the beginning of the polygon, and then that coordinate shows up at the end as the last reference. So we get points, we get lines and we get polygons. And most of the time when we have health data, this is what we're doing. If we have geocoded individual events or locations, we have a set of points that we're working with. More often what we have is a geocoded area, a zip code or a census tract or a county or a state, as a polygon. Then we reference our data to that.
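
The three vector types can be written down directly as coordinate lists, including the convention that a polygon's first coordinate is repeated as its last. The sketch below uses invented coordinates; the helper names are hypothetical and the area calculation (the shoelace formula) assumes planar coordinates, not latitude and longitude.

```python
# A point is one XY pair, a line is an ordered string of XY pairs, and a
# polygon is a ring whose first coordinate is repeated at the end.
point = (2.0, 3.0)
line = [(0.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]


def is_closed_ring(ring):
    """A valid polygon ring has at least four coordinates and ends where it starts."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def ring_area(ring):
    """Planar area of a closed ring using the shoelace formula."""
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


print(is_closed_ring(polygon))  # the 4-by-3 rectangle closes on itself
print(is_closed_ring(line))     # an open line does not
print(ring_area(polygon))       # area in squared map units
```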

Then the final thing about this, which doesn't relate exactly to vector versus raster, but does to some extent, is the two different major categories of maps that we work with. The choropleth map is the map that most of us use. This is a map that basically classifies the units of observation into categories, where we assign a value to the entire unit, whatever that might be, and then we map across the areas. And in one of our exercises we'll look at a number of different ways that you might classify the values in making the map. The isopleth map, again, is from the Greek isos for equal. This is a map in which, from our data points, we interpolate lines of equal value across the map and then we generate the map on that basis. All of you are familiar with isopleth maps, because there are two kinds you frequently see. In fact, I looked at one this morning. I was reading the New York Times in my hotel room before I came down. It has a weather map. It shows the lines of equal value for high temperature today. So that's an example of that. More commonly you see topographic maps that show lines of equal elevation, but any data that you have in point form you can make into an isopleth map rather than a choropleth map. The problem we have, and I've seen it repeatedly when making these kinds of maps for people in public health, is that the higher up the administrator, the less willing they are to look at an isopleth map rather than a choropleth map, because they're so tied to thinking about things in the administrative units, you know, the city of New Orleans or Milwaukee County, Wisconsin, and they just have trouble with this kind of a map. But I think it's something that would be much better if we explored more fully, because there's a great deal of artificiality that comes from working with administrative units. And actually there really are no units for reporting health data that make much sense. I still haven't found a disease yet that respects zip code boundaries. In fact, I haven't found a person who respects zip code boundaries.
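
The point about different ways to classify values for a choropleth map can be illustrated with two common schemes, equal interval and quantile. This is a sketch under assumptions: the rates are invented, the function names are hypothetical, and the quantile rule used here is one simple convention among several that mapping software may apply.

```python
import bisect


def equal_interval_breaks(values, k):
    """Split the range from min to max into k classes of equal width."""
    low, high = min(values), max(values)
    width = (high - low) / k
    return [low + width * i for i in range(1, k)]


def quantile_breaks(values, k):
    """Choose breaks so that each class holds roughly the same number of areas."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // k] for i in range(1, k)]


def classify(value, breaks):
    """Return the class number (0-based); a value equal to a break goes in the upper class."""
    return bisect.bisect_right(breaks, value)


# Hypothetical rates for eight areas, with one extreme value.
rates = [2, 3, 4, 5, 6, 7, 8, 30]

ei = equal_interval_breaks(rates, 4)
qt = quantile_breaks(rates, 4)
print([classify(r, ei) for r in rates])  # the outlier sits alone in the top class
print([classify(r, qt) for r in rates])  # two areas per class
```

The same eight values produce two very different maps: with equal intervals, seven areas share one shade, while quantiles spread the areas evenly across shades but hide how extreme the top value is.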

UNKNOWN SPEAKER: Russ, can you do it backwards? For example, could you do it, memorize the (inaudible), could you do an isopleth map and then superimpose, for example, congressional districts?

RUSSELL KIRBY: Definitely. That would be using the layer function within the GIS. You could definitely make a map like that. In fact when you look at the weather map that's what they do. They have the state boundaries as another map element. So you could definitely do that.

Now, map projections are another thing. We're not going to go into a lot of detail about map projections. That is a seemingly arcane topic for GIS, but depending on where you're getting your data from it can be a very difficult problem. The problem is that the earth is a three-dimensional object. And even though we're typically only mapping things that are on the surface of the earth, it's basically a three-dimensional object and we're trying to represent it in two dimensions. So we have problems with that.

Well, there's a variety of different methods that have been developed for projecting map data into the two dimensional form. And the problem comes up, depending on how your particular spatial data have been stored, you might find that they're stored using different projections. When you try to use those data, the GIS will actually do it, but you might end up with something that's completely nonsensical in terms of the information that you have. It's much better to make sure that the map projections are the same for each of the data sets that you're working with.

RAVI SHARMA: Especially if you have different layers.

RUSSELL KIRBY: Exactly. I'm not an advocate of one map projection over another, but I think it's important when you're working with a geography file, in our example, to take a look at the documentation and find out what projection it's in and make sure your other data are also projected in the same way. I don't think we'll go into a lot of — do you have one exercise or —

RAVI SHARMA: We actually do have an exercise that we'll ask you to define your projection and change your projection. And that will come later on. We are going to do hands on. So don't think we're just going to be lecturing.

RUSSELL KIRBY: In fact very soon.

RAVI SHARMA: Yes.

UNKNOWN SPEAKER: Will it affect the longitude and latitude that you get out of (inaudible).

RUSSELL KIRBY: The projection reflects the way that the latitude and longitude are represented on the map. I don't know if we have examples, but if you're familiar with the Mercator projection. Here we have —

RAVI SHARMA: This is an example of —

RUSSELL KIRBY: With the Mercator projection, which is the projection where the equator is in the middle of the map and the lines of longitude are smaller toward the equator and get much wider as you get toward the poles. So Greenland looks as big as all of North America. That's one projection. There are other kinds of projections we call equal area projections that preserve area: if you take a square of any particular size, it will be the same across the map. And this here just lists the different aspects of map information that might be distorted depending on which projection you use. And I don't know if I want to go into all the details about this right now. But I want to let you know there are a variety of different issues that come into play and there's no one projection that is the ideal projection for every application. You have to think about what your specific purpose is to find the one that works best. Typically when we're working with public health data, particularly if it's for a particular state, we are most likely going to want to preserve the equal area function and to some extent also the conformality, and those are the kinds of things that you typically will want to use. If you're doing a map across the entire United States, some of these other features might be of concern as well.

And of course then the coordinate systems that are used are basically the way we translate the information from the three-dimensional earth as it is to the locations on our map. When I first started working with automated cartography, it was very rare to have data that was reported in longitude and latitude. In fact, when I worked in Wisconsin, we used the Wisconsin state plane as the coordinate system. That was a coordinate system in which the value of zero comma zero was in the very southwestern corner of the state, down by Galena, Illinois. And they referenced everything from that particular point. That was great, but if you got data from USGS reported in latitude and longitude you had to convert all the information, one way or the other. And it caused all sorts of problems. But nowadays almost all data are reported in latitude and longitude, and that's something that's gotten standardized. So that's something that we don't have to worry as much about. But you don't have to use latitude and longitude. What you have to do is make sure that each of the data sources that you have is reported in the same way. And this shows, I'm hoping everybody knows about coordinate systems with lines of latitude and longitude. I won't go into much detail except to make one little point, which is that there's no such thing as a wide degree of latitude. All the degrees of latitude have the same distance. But there is a wide degree of longitude. If you want to change that expression, that would be good.
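
The remark about degrees of latitude and longitude can be checked with a little arithmetic. The sketch below assumes a perfectly spherical Earth with a mean radius of 6,371 km, which is a simplification: on the real ellipsoid, a degree of latitude also varies slightly. Under that assumption a degree of latitude is the same length everywhere, while a degree of longitude shrinks with the cosine of the latitude.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean radius of a spherical Earth


def km_per_degree_latitude():
    """Length of one degree of latitude on a sphere (the same everywhere)."""
    return 2 * math.pi * EARTH_RADIUS_KM / 360.0


def km_per_degree_longitude(latitude_deg):
    """Length of one degree of longitude at a given latitude on a sphere."""
    return km_per_degree_latitude() * math.cos(math.radians(latitude_deg))


for lat in (0, 30, 60):
    print(lat, round(km_per_degree_latitude(), 1), round(km_per_degree_longitude(lat), 1))
```

This is also why distances computed directly from raw longitude and latitude differences are misleading away from the equator, and one reason to put all layers into the same projected coordinate system before measuring.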

So is that it?

Okay. We're at the point in time where we said we were going to have a break. So why don't we take about a 10 to 15 minute break. We're going to come back and fire up ArcGIS and get to work. Incidentally, there's food, coffee and other things in the back of the room. And I think it's okay to bring things to the computer work stations, but please be careful.

RAVI SHARMA: Don't spill anything.

RUSSELL KIRBY: Particularly Coca Cola is not good on keyboards. We'll take about a 10 to 15 minute break.