Meet the ICPSR Director – Ask the ICPSR Director!


Hi everyone, this is Maggie Levenstein. I am the new… I still feel like it’s very new, though I guess we’re almost three months in, Director of ICPSR. Welcome to our 2016 Data Fair. I’m going to talk today a little bit about who I am and what I’ve been doing and thinking about as the new Director of ICPSR and I look forward to hearing your questions and thoughts about ICPSR’s direction as we move forward. So I’m going to start off first by talking about who I am. Some of you I know already quite well. Some of you have probably never heard of me before so I’ll tell you a little bit about me. Then I’ll turn to talking about the role of ICPSR in the current data environment. As we all know, the data environment is changing rapidly and that creates both opportunities and some challenges for ICPSR and we’re looking forward to working with them together here and working with you. I’ll talk a little bit about our plans for the coming year and then take your questions and hopefully answer them. And you can see we have beautiful pictures of Ann Arbor in the fall. It’s not quite fall, well I guess it is officially fall here, but it’s not… we’re not… the trees are still pretty green. So we are enjoying this, but you can see our beautiful historic Perry Building, Perry School building here. So who am I? I am an economist. I’ve been at the University of Michigan since 1990. I’ve been teaching at the Ross School of Business since about 1998. I have been working at the Institute for Social Research since, I think, 2003 as well. I joined ISR when I became Director of what is now known as the Michigan Federal Statistical Research Data Center and I’ve been Director of that ever since. I still am Director of it now though we are looking for a replacement, if you know anyone who is interested, that job is posted. 
The Federal Statistical Research Data Centers are our joint projects with the Census Bureau to make restricted/confidential data from the Federal Statistical system available to the research community. While I was at ISR running the FSRDC, I also took on a number of other projects that were basically designed to make confidential and restricted data products available to the research community. Some of these were hosted by what we call the MICDA data enclave, that’s the Michigan Center for the Demography of Aging. And it hosts restricted data products from the Health and Retirement Study and the Panel Survey of Income Dynamics and we work with researchers in order to provide them with secure access to restricted data from those data projects. We also host the Michigan Historical Census project which has researchers here at the University of Michigan using hundred percent count data from the decennial censuses from both 1940 and earlier. And we also host a collaboration with the IEB in Germany where we make restricted German data available, again to the research community here in Ann Arbor and broader. We’re now… actually those, those data are now available at Cornell, and at Berkeley, and at Harvard, so a number of other places in the United States give researchers access to confidential data from the German Social Security System linking survey and administrative data. I’m also a member of the NSF Census Research Network Coordinating Committee. So I’ve been involved in a variety of projects making, mostly confidential data available to the research community in secure computing settings, a variety of different kinds of secure computing settings. Actually one of the things that I’m particularly proud of in the work that we’ve done in all of these projects is that we’ve really managed to find better ways to expand access to the research… [background noise] …excuse me, to the research community. 
The FSRDCs, the census projects, will require that people come into a secure computing facility in a particular building and we try to make those as nice a place to work as possible. But you really, but you do have to come to that specific space and there are now about 30 of those around the country. But obviously that’s quite limited compared to other ways of accessing data. In MICDA and through the German data enclave we have managed to provide researchers with remote access, so much broader access to restricted data products than they would if they had to come into a physical space. So those are… So that’s what I’ve been working on. That’s very different from most of what happens at ICPSR where we’ve been starting with publicly available data that people could download, people have been able to download from the ICPSR website. Though increasingly ICPSR also has restricted data products that we make available either through a restricted data license or through our virtual data enclave. So, for a variety of reasons, I think we see a lot of overlap between these kinds of projects. A little bit more about my research areas. As I mentioned, I’ve been very involved in what is called the NSF-Census Research Network. The Michigan node of that Network, I put links up to these things in case you’re interested, the Michigan node of the, of what we call the NCRN Network focuses on making naturally occurring or organic data, non-traditional survey data, available to the research community and turning it into useful data products. So one particular project that I’ve been working on that was, that’s been a lot of fun and really broadening for me, has used Twitter data to, has tried to extract information from people’s Tweets to come up with measures of economic activity. Particularly measures of job loss, and job search, job labor demand, and labor supply behavior. And we have that up on the web. 
You can see our weekly predictions of new unemployment claims and other measures of economic activity. But in general, our NCRN node is about extracting information, economic content, from various kinds of sources that are not traditional survey data. I also run a project which we call CenHRS, which is about linking non-traditional administrative data to survey data. In this particular case, we are linking Health and Retirement Study data to the employers of HRS respondents through the administrative data that the Census Bureau has on those employers in order to address questions such as: “What kinds of employers employ older workers? What kinds of opportunities are there for workers as they age and as their physical and cognitive abilities change?” One thing that I really like about that project is that it places respondents to a household survey in a very different social context. Most of the surveys that we have done in the United States over the last century focus on collecting information about individuals and the members of their household. And we’ve spent a lot of time trying to enrich that data, perhaps with information about their neighborhood, about their school, about their community. We have much less information about Americans in their work context even though that is an important part of our social, and health, and overall well-being. And understanding that employment context is, I think, actually extremely important to creating a full view of the social and economic experience of individuals. And that project creates data that will allow people to analyze those respondents in that work context, so it’s a project I really like. I have a couple other projects which are less related to ICPSR type activities. I do a lot of work on innovation in the American Midwest and innovation more generally. 
There’s a link there that I put up which is not about the American Midwest, but is about Innovation in American Universities. And it’s a project that we have here at the University of Michigan and at ISR that is also joint with the Census Bureau and several dozen other universities around the country. I’m collecting and creating new data from administrative data on, on the organization of innovative activities in American universities. I’ve also been involved in a long-term project on patenting and the organization of financing of patenting in the American Midwest from the 1880s through the 1970s. So if you look on my personal website you can find links to those papers. Finally, I do a lot of work on International Competition Policy, on price-fixing conspiracies and cooperation among firms. And I put up a link that I’m going to be giving a talk at a conference in Mannheim next month so I put that up there. So if you want to see what I do in that area you can see that as well. Okay finally, if you’re not completely tired of hearing about Maggie, here’s a picture of me and my family on vacation in Acadia, Maine this summer. My husband, David, is a Professor of Education Policy at Michigan State University. You can see I put that in green. We are a mixed family with both Michigan and Michigan State represented and we, but we do manage to survive that okay. And then we have two young adults, not as young as they used to be, adult daughters. Our older daughter is a graduate student in Public Health at Harvard and our younger daughter is an undergraduate at Barnard College and I am adjusting to having them available only by text. [Laughter] Okay, so what about ICPSR? What’s my vision for ICPSR? What are the challenges of ICPSR? What are the exciting things that we are doing? 
When I think about ICPSR and what it’s doing now and what it will be doing in the future, I think about what kinds of data researchers want to be using and want to be able to access for their research in the future. And we all know that the kind of data that researchers are using is changing and that means that… Oh sorry I just saw a question, sorry so this is something about adjusting the camera, which I didn’t think I was even… I don’t know, I’ll let you do… I’m going to let Linda help with that and I’m going to keep talking about data for the future. Oh it says it shows the top of my head. I didn’t think it showed me at all, but I’m really glad because the idea of looking… oh look at that. [Linda] There you are. [Laughing] [Maggie] Alright let’s… can we? Sorry about that. See I was telling Linda I put on lipstick and you can’t even see it because you can’t see my mouth. Now looking at, looking at a computer screen which is just slides for… here maybe we can lift the chair, that might actually work or I can just stand. How about that? Okay? [Background voices] Oh there we go. Okay, I’ll have to sit up straight. Okay, [Laughter] so… here can I just move, yeah, let me just kind of move this [Background voices] over here? Because I don’t actually need to look at my face, but there then I can… and then I’ll sit up straight. Okay. Sorry for that slight interruption. We’ll edit that out before we put it on YouTube right Linda? Okay, no I don’t care. Okay, so what are the data that researchers are using now to do social science, to move social science forward, and how do we make sure that the data that researchers are using is available for others? Researchers are investing a lot in creating new data sources. It’s extremely important that that data be available to other researchers. In part because researchers need to be able to check one another’s data. We all know how important reproducibility is. 
But it’s also that when you make that kind of investment in data creation you want it to get used again. We want to be able to have the research community and social science progress based on the investments that are made. So we need to think about the data that researchers are using and make sure that we have good robust ways to preserve it and make it useful to other researchers into the future. So what are the kinds of data that researchers are using that we need to be thinking about, perhaps a little bit differently than we have other data? So I’ve talked so far about social media data, about things like Twitter. In general, there’s a lot of research that goes on today using data that is scraped from the web. Whether it’s scraped from Twitter, which has made a lot of data available, so there’s a lot of research using that. But there’s also, you know, every graduate student I know wants to learn Python and scrape something and then use it for their dissertation, but the web is constantly changing. So what I scrape today is different from what you scrape tomorrow, and how do we organize the data that gets scraped and how do we, again, make sure that it’s preserved in a way that facilitates robust social science research? So there’s social media and other web based data, there’s administrative data. This is less novel in some ways, researchers have been using administrative data like Social Security earnings data or employment training, food stamps. Those are things that we’ve been using linked to surveys for a long time, but the increasing computational capacity and the increasing availability of administrative data have meant that this has become more important. 
As some of you probably know, the Murray-Ryan Commission on Evidence-based Policymaking is looking at trying to create a clearinghouse for administrative data that would make it available for evidence-based policymaking to inform decision making about policies by local, state, and federal government bodies. Having these kinds of data available to the research community will mean that evidence-based policy is based on much more rigorous analysis of the evidence than would be the case otherwise. And so we want to think about how to participate in that effort, but we also want to think more generally about how to make sure that administrative data is available to the research community in a way that protects confidentiality, but that also makes it again the basis for rigorous and robust research. Transactional data: I come out of a business school, so increasingly we see researchers using data on healthcare transactions, health insurance transactions, sales transactions; people want to use things from Zillow about real estate transactions. So there’s all kinds of things. The possibilities are sort of endless when you think about that, but what it means for how we archive data is really quite challenging. Sensor data. How many of you are sitting there with a Fitbit that is telling you to get up, and don’t just sit there and watch Maggie talk, you should be getting up and stretching and walking in circles while you’re listening to me? Well, if you’re wearing a Fitbit or some other kind of sensor or monitoring device, that’s actually really important information about human activity, about human health. Your phone keeps track of it. Apple may be analyzing it for their own purposes. 
There are a lot of researchers who are trying to use data like that to give us better ideas about social interaction, the relationship particularly between social interaction and health, but the quantity of data is really really large. If you’re, you know, measuring what somebody’s doing every second of the day and different things about what they’re doing every second of the day. So the quantity is large and making that useful to researchers, particularly researchers who are not specialized in working with sensor data is actually very challenging. That’s something that we’re working on. Similarly imaging data, that imaging data, some of that is also medical data in terms of medical images, but we also have images from satellites that tell us about whether or not people have tree cover, whether or not people walk, whether there are sidewalks, all kinds of things. Those kinds of images are data that give us an enormous amount of information about the social and environmental context in which people live. Helping researchers to extract information from that kind of data and make it reusable is important. All of these kinds of activities… all of these different kinds of data are often involved in projects in which we’re trying to link one thing to another. So if I have your sensoring data, I have your, I have your Fitbit, I also might ask you some questions about you. Like your age, your Fitbit doesn’t tell me how old you are, but the survey that you took when you set up your Fitbit does. I want to be able to link that, those two to one another. I might even want to be able then to link that, I don’t know, to your Medicare data right, or to your health expenditures. How does your physical activity affect what you cost, your cost to your health insurance provider, your healthcare provider? 
There are all kinds of… As I said, I don’t want to know this about you, I want to know this about lots and lots of people in order to understand how changes in what an individual does, or in how we provide health insurance, change people’s health outcomes and people’s use of medical activities or medical interventions. So we want to be able to make all these data available. These are actually, I think, the kind of fun exciting things about data in the future. They’re what people are doing, it’s what researchers in the field at the cutting edge of empirical social science are doing, and one of the exciting things about ICPSR now is that it’s our job to help researchers use these data and then to work with those researchers who are doing that to try to lower the cost of them sharing that data with other researchers. Okay. Next slide, oh I’m not [unintelligible]. Okay. Here we go. Alright, so the challenges of doing this… well, I’ve mentioned some of these things. Some of these data are computationally just much much larger than what social scientists have traditionally used. And that means we need to have other ways of providing that computational capacity to people, whether it’s using multiple computers, or using cloud-based computing, or using other kinds of computer infrastructures. That becomes particularly challenging because much of this data is also confidential. You know, the reason I said I don’t want to know what’s on your Fitbit is you probably don’t want me to know that it’s your Fitbit and how much you actually got up and walked around today. These kinds of data sources… if I’m using images from satellite imagery and I know whether or not people are walking around in a neighborhood, well that actually could compromise someone’s confidentiality, someone’s privacy. 
We want to be able to extract information from that, we want to be able to use that for social science research, but we also need to do that in a way that protects confidentiality. It’s easier to increase computational capacity if we rely on lots and lots of different computers, but once we do that, that actually makes managing the confidentiality harder because there are more points of access. So we need to balance computational capacity against confidentiality protection, using various ways to protect confidentiality, including statistical methods, different methods of computer access, and providing different kinds of data to different researchers. We want to think about that and there’s a lot of research that we are engaged in and that others, you know, around the world are engaged in, in order to improve both data based and computer based methods of protecting confidentiality. And we need to do that in an environment in which researchers want more and more computational capacity. We need to do this in order to provide accessibility to social science researchers and that means, and this is where I think ICPSR’s role is, if not unique, I would say it is special, right, that it is our job to lead in the development of metadata and metadata standards that facilitate the use of these different kinds of data for research and for reuse and reproducibility. If I take a bunch of Fitbit data and I have to turn it into something which is usable data, and I do a bunch of analysis, and I publish analysis, and somebody else comes along and they take maybe even the same set of Fitbit data, but they aggregate it differently, they aggregate it, I don’t know, not by how many steps you took per minute, but by how many steps you took per hour or how many miles you walked per day, right. How are those comparable? 
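To make that comparability question concrete, here is a minimal sketch in Python. It is an illustration only, not an ICPSR tool or standard: the step readings are invented, and `describe_aggregation` and its field names are hypothetical. It aggregates the same raw per-minute readings per hour and per day, and attaches a small metadata record to each result so that a later analyst can see how the two derived data sets relate.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw readings: (timestamp, steps counted in that minute).
raw_steps = [
    (datetime(2016, 10, 3, 9, 0), 40),
    (datetime(2016, 10, 3, 9, 1), 55),
    (datetime(2016, 10, 3, 10, 30), 120),
    (datetime(2016, 10, 4, 8, 15), 80),
]


def aggregate(readings, key_func):
    """Sum step counts over the bucket that key_func assigns to each timestamp."""
    totals = defaultdict(int)
    for timestamp, steps in readings:
        totals[key_func(timestamp)] += steps
    return dict(totals)


def describe_aggregation(unit, source):
    """Build an illustrative metadata record; the field names are assumptions."""
    return {"measure": "steps", "aggregation": "sum", "unit": unit, "source": source}


# One study aggregates per hour; another aggregates per day.
per_hour = aggregate(raw_steps, lambda t: t.replace(minute=0, second=0, microsecond=0))
per_day = aggregate(raw_steps, lambda t: t.date())

meta_hour = describe_aggregation("hour", "raw_steps")
meta_day = describe_aggregation("day", "raw_steps")

# Because both records name the same source and measure, the hourly series
# can be rolled up to days and checked against the daily series.
rolled_up = defaultdict(int)
for hour, steps in per_hour.items():
    rolled_up[hour.date()] += steps

comparable = (
    meta_hour["source"] == meta_day["source"]
    and meta_hour["measure"] == meta_day["measure"]
)
print(comparable and dict(rolled_up) == per_day)
```

Without the metadata records, nothing in the two result tables says that they come from the same readings or that one can be rolled up into the other; recording the unit and source is what lets the comparison be checked.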
We need to be able to make data comparable from one research study to another in order for the studies not just to be kind of one-off, you know, comments on the world. I mean, they add to somebody’s CV, but they don’t actually lead to scientific progress unless we know how these different analyses of the same data speak to one another. So we need to have that kind of reproducibility and that reproducibility is only possible when we have metadata that allows us to compare how data is being used in one study to how it’s being used in another study. Again, however one researcher does it, we also want… when you spend the time going through these kinds of data to make them useful for research, we also want them to make that investment to be able to allow for re-use. So that is really, I think, that is the challenge but also the fun of being at ICPSR in the current period is that it is our job to think about how to design metadata and metadata standards for these new kinds of data. So what are we doing? Well there are several projects that we are working on right now. So this says projects for the (near) future. I am… I think before I even started I had a dozen projects that I thought we were going to start immediately. I am, we’re doing them, I wouldn’t quite say one at a time, but only a few at a time right now. So these are the ones that have, for a variety of reasons, sort of moved to the fore, though we are doing a bunch of different things that I’m pretty excited about. So the first is that we’re working on establishing standardized researcher credentials. We are interested in developing a system which standardizes the credentials of researchers in order to facilitate access to confidential and restricted data. So right now, any of you who have tried to get access to restricted data know that this is often an extremely cumbersome process. 
It deters researchers from using data and it creates perhaps unnecessary barriers and privileged access for some people who have figured out, or who have people who can help them get through, those barriers relative to others. Some of those barriers are absolutely essential to protecting confidentiality or to protecting someone’s property rights, but others are perhaps less necessary; because we’re not sure how best to protect confidentiality, it’s always easier to protect things by saying “no”. So okay, so we’re trying to establish researcher credentials that are not just for access to data at ICPSR, but access to data that’s restricted from a lot of different users, that will allow researchers who are creating data to say: “Ah well if you have, you know, gold standard credentials you can access this version of the data. If you only have, you know, copper credentials or tin credentials even, you’re an undergraduate doing a project at home over the summer, you’re going to use a different version of the data. We may control it differently. You may get access in a different mode, but a researcher who has established a reputation, has training in how to handle confidential data, is working in a secure environment, they’re going to get access to perhaps more confidential data because we know that they know how to manage that.” So we want to establish that as a resource for access to ICPSR data, but also for the research community, in order to facilitate this process of access to restricted data. We’re also working on developing metadata, as I talked about, for administrative and linked data. One of the things that we’re very excited about is we’re rolling out a new software platform that will underlie, most of it will be invisible to all of you, but it will underlie all of the ingest, and processing, and dissemination of data at ICPSR. 
And one of the things it does is to improve our ability to version data, but particularly when somebody’s working with administrative data which is coming out of say, I don’t know, food stamp data, right, someone works with data that comes out of a data system that was not meant for social science research, there’s a lot of data cleaning that goes on. That data cleaning is done by the researcher community. We’d like to be able to have software, metadata software, that allows researchers, as they improve data, as they comment on the data, to crowdsource. To have a community curation of that data, to have those researchers input that information into the metadata that goes with that data set. If researchers are participating in this sort of a, it’s like a Wiki, you know a Data-pedia, Wikipedia of data, right. If researchers are participating in that we also need to be able to version the data. We need to know that this was the version of the data and the metadata that I used, and this is the version of the data that you used, and even maybe go back if somebody does something that’s not so great with the data. So our new software platform, which we call Archonnex, will allow for better versioning and we are testing out some different crowdsourcing software that we’re hoping will facilitate this for the research community. We’re also working on a project that will develop metadata for imaging and sensor data, as we talked about. Again, this raises challenges for computational capacity, but creating this metadata for these kinds of things will make them much more useful to a much wider ranging, a much wider swath of the researcher community rather than just specialists or people who have a lot of resources which right now, is what it takes to make use of these data. Okay. Supporting data sharing… I’m kind of keeping an eye on the time. Can I go to 1:00, right? Okay, I can go to 1:00, all right. Then I want to get to your questions, but I will get through my slides. 
So we also are going to obviously continue our support for data sharing, data collection, for reuse and preservation. This is a core activity of ICPSR throughout its lifetime, but it’s increasingly required by funders and by publishers and we believe in this. It is critical for scientific progress, for researchers to share their data and to share it in a way that’s actually useful to other researchers. This is where the process of curating the data, providing useful documentation, and storing it in formats that are, that will be durable into the future and accessible into the future, are things that have really been the core of what ICPSR is about. And we intend to continue this, but also to develop techniques, develop tools, I would say for the research community to make that easier and more accessible and to facilitate the kinds, both the kinds of data and the new kinds of technologies that people are using to analyze data in the future. So this includes software and training to support data sharing, which is critical because it reduces costs to the researchers. I mean, everybody who’s a researcher, who’s created data knows that it’s a pain [laughs] to share the data. And if we think of it just as a mandate, something that you have to do at the end of the project, after your funding has already gone or when the, you know, when you’ve gone through your third round of, you know, revise and resubmits and they finally say: “Okay we’ll take it. Now give us the data.” And you want to do this, oh gosh, at lowest cost possible. Part of what ICPSR’s job is is to work with researchers to help them understand that, understand how best to create their data in a way that, and develop software and tools, to help researchers share their data, document their data, so that it’s useful to other researchers, but also so that it’s much less costly to them to do what they know is the right thing to do, which is to share the data. 
Or what they may be mandated to do whether or not they believe it’s the right thing to do to share the data later on in the project. It’s much easier if we do that and we give people the tools early on. We also want to lower the cost to researchers of sharing their data. We also want to give researchers an incentive to share their data, and I think you really do that by giving researchers who create data and who share, deposit their data, some feedback about what is happening with their data. So we, everyone who deposits their data can now get usage information about how many downloads of their data there’ve been. We want to improve the information that we’re giving to depositors about how their data is being used. As probably all of you know we currently collect information about data related publications and those are available in the ICPSR Bibliography. You can see who cited your work in their research or who’s used, whether or not they cited it, who’s used your data in their research. We would like to improve the measurement of data usage with what we are, someone will come up with a better name, but I think for now, we’re calling it a Data Impact Factor. So something which you can tell your chair, you can tell your promotion committee, you can tell your funders that the data that I spent all this time and energy collecting, it’s had an impact. Other people have used it, the studies that have used these data have had an influence on the profession, on the thinking in the profession. In order to do this we really need standardized data citations, not just a format but policies on the part of publishers, on the part of journals, on the part of other users of data to cite data. If you’re using someone’s data you should be citing it. So those are the kinds of projects that we have ongoing in the immediate future. And immediate, I am learning, [laughter] is over the next months and years. Though months I like better than years. 
There are a couple of more newsy things that I want to share with all of you. One is the Archonnex rollout. We launched the new software platform for ICPSR in 2016, with the new version of openICPSR. So if you go to openICPSR.org you can see what that looks like and it is supported, again you can’t tell this, but it’s supported by this new software that we’ve developed here and that we think will provide the basis for improved functionality for lots of ways that we curate and disseminate data. So we’re very excited about that, and it will make it easier for people to deposit as well. Oh and the Archonnex webinar is going to start at one o’clock so if you’re interested in learning more you just stay logged on. Register for that and stay logged on, you can learn more about that. I also want to give a pitch: the Resource Center for Minority Data, one of our important data archives here at ICPSR, is looking for a new director. Our director retired, then we got another director, and he retired too. So we’re looking for someone who will not retire immediately, but we are looking for someone who is really interested in outreach to both the data producing and the data using community for all kinds of data from minorities. And when we think about that, we think about that quite broadly in terms of race, and gender, and sexual preference, and ethnicity. It’s a great resource and has really improved data access for the research community and people’s awareness of the availability of data. So we’re really excited about that. If you’re interested or know anyone who’s interested, please let them know. Actually, I don’t have the link up, but within the week we should have a link up as well. We will be looking for a new Associate Director for ICPSR. So look for that coming as well. All right, questions for me. I have questions that came in before the webinar. [Background voice] Oh good, because I have old eyes. 
I’m not ready to retire yet, but I definitely [laughs] do not wear… well whatever the opposite of bifocals is. So okay alright. I think those are all… so “What’s been my biggest surprise since beginning at ICPSR?” I would say my biggest surprise is how much energy, and enthusiasm, and how hard-working people at ICPSR are. I mean I’d interacted over the years with a small number of the people at ICPSR who would, I worked with on different projects, but there were a lot of people so, probably ten percent of the people at ICPSR, and now I think I’ve met pretty much, you know, all of our hundred plus employees. And I would say the enthusiasm, people are just, I’m really impressed by the enthusiasm and the knowledge that people have. And I guess, I mean I guess I don’t know if that should have been a surprise, but I didn’t know and so that’s been really really nice. And so now I’m going to turn to some of the questions that came in. Some of them I think I’ve already… “How am I liking the job so far?” It’s a lot of fun. It’s really a great group of people. So for those of you who have never been to Ann Arbor, who have never met ICPSR staff in person, I know some of you are ORs, and the next time we have a meeting of the ORs if you get the chance, please do come to that meeting. I know we’re planning also some regional meetings for ORs. I will give you a chance to meet some of us in person and those are just… it’s a great group of people and it’s really… so I urge you to participate in that. “How can I find out if a deposit I made to ICPSR is making a difference in the research community?” So one thing you can do right now is to look on the website and you can find out how many times a particular data set has been downloaded, how many times it’s been cited in publications. So those are available through the study page for every ICPSR data set. So you can know that now. 
If you have questions about other things you’d like to know about your data, about ways that you’d like to have the impact of your data measured. Email us, email me, email Linda, email any of us at ICPSR and let us know because we’re trying to think about what are the best measurements. What is the impact that we want, that we want to measure, and what the best way of doing that. I have some ideas and other people here do too, but we’re interested in your ideas. “How can ICPSR members best support ICPSR this next year?” Well I think there are a couple ways. One is you can let people know about ICPSR, both about the Summer Program. I mean, for those of you who don’t know, I’ve been talking mostly about our data and I have to say most… I had given guest lectures in the Summer Program over the years, but I think most people, if you go around the world you talk to people who say, “Oh I spent a summer in Ann Arbor it was, you know, it changed my life, it was the best time I ever had.” So for those of you who’ve never been to a Summer Program course, check out the Summer Program course. Let other people know about it, both undergraduates and graduate students. And young faculty and older faculty too who want to prevent Alzheimer’s by learning new things [Laughs] should think about coming to the Summer Program. As I mentioned we’re going to have some regional meetings for the ORs and others to learn more about ICPSR. Some of us have taken that on the road. Those are ways you can engage with ICPSR. We have the YouTube channel and webinars so you can learn more about that. You can use ICPSR data in your classroom. I know there are a bunch of really cool empirical learning tools that we have, what did we call them digital guides? [background voice] Data-driven… [Maggie] Data-driven Learning Guides, you can see I’m new. Data-driven Learning Guides that are up there that make less work for you when you have to come up with new assignments. 
But things that are more interesting, and more useful, more kind of hands-on learning, which we all know improves retention. So hands-on learning for your students. Those are all ways that you can engage with ICPSR in the coming year. If you’re interested or have thoughts about any of the kinds of things that we’ve been talking about. About new kinds of data or data credentialing, email us and we’d be happy to have you participate. For those of you who are involved in data collection, we actually have advisory committees for many of our archives that involve researchers who are involved in producing and using data. Those are more senior researchers. Things, you know, the people who if you don’t want to come spend the summer in Ann Arbor, but you’d like to give us some advice about how we are curating different kinds of data. We always have lots of ways for people to give feedback on that. Other things? “Is there a way that membership can have a more direct interaction with ICPSR besides data depositing and data usage?” Well there’s depositing, there’s usage, there’s participating in our webinars, there’s participating in our classes, and as I said, in our regional fairs. If you’re not an OR and you’d like to learn about being an OR, let your local OR know, they probably want to have a backup. Other ways… So those are all ways that you can engage with us and with almost all of our, especially our topical archives. If you have questions about the data there are actually specialists on staff who are available to help with that as well. But if there are things that you’d like to do to improve the data we’re also present at most national and international conferences that are engaged in both data archiving and in empirical social science. You can come and visit us when we have our booths at different fairs. Talk to us about the data that are available. So those are all ideas. Another question that’s come, “Is there an overall… Ah oh that’s a good point. 
See I have help here because, believe me, I am not doing this all on my own, not even close. So another suggestion is to let us know about data that’s useful that we should encourage for deposit. I will tell you I’ve done this, you know, somebody I think has really valuable data and I say, “Go get it out of your garage and ship it to us, you know?” And we can, yes, can we get it off of those floppy disks from the 1970s? The answer is “yes”. I don’t know if people here want me to tell you that, but the answer is “yes”. If you have valuable data or you know someone who has valuable data that should be made available, should be preserved for the research community, send it our way. That is what we are here for. “Is there an overall annual percentage of data downloads, publications, or classroom instruction?” So how many downloads do we have of data from ICPSR every year? [Linda] Over a million data sets downloaded last year with 52,000+ active accounts. [Maggie] So the answer to that again, see I wouldn’t know, we’ve had over a million data sets downloaded in the past year by over 52,000 unique data users. So more than 50,000 people downloading over a million datasets. I don’t know, off the top of my head, if we have any way of keeping track of how many people have used the data for classroom instruction. Do we? [Linda] We have measures of how many people have accessed it on campus. [Maggie] Ah good point, so we have how many people have accessed our data from different campuses and we have how many people have accessed the Data-driven Learning Guides themselves, and the publications again, I don’t know how many… We could probably, do we know how many publications we’ve added to the bibliography in the last year? [Linda] We do, but I don’t have those stats with me. [Maggie] I don’t know off the top of my head, but we have over 70,000 publications, data related publications. Those are publications that use, that analyze ICPSR data sets in those publications. 
There are over 70,000 of them, of citations of those in our bibliography, up on our website now. So other questions? I think ICPSR it’s actually, I think, one of the questions just kind of, “How do I feel about ICPSR’s future and the health of its future?” I think ICPSR is, I’ve actually been really struck by how strong the ICPSR Membership is. It is a, I think it’s a really special community. We don’t talk about it that way, but it’s of universities and research institutions that recognize the value of data, and of sharing data, and of preserving data. And I think… so hats off to Linda, our Membership Director, but also to the social science community, the behavior and social science community that recognizes the importance of this and continues to support ICPSR and to support the growth of data sharing and data preservation for the future. We know that this is critical to scientific advance, to improvements in our understanding of the world in which we live, and it makes you hopeful, it makes you proud to be part of an organization and part of a community that believes in this. So that’s what I would say about the future of ICPSR, and of social science research, and empirical social science research into the future. Anything else? Other questions? I think that’s all right and I don’t see anything else coming in. I will wait, I know sometimes people are nervous about sending in questions. I’ll give you one second before we sign off, but you might want to take a few minutes before you sign back on to hear more about Archonnex. Thank you all so much for joining and for participating in this community. [Linda] And a big special thanks to Maggie for launching. It’s always hard to be the first one up and she handles it like a pro. It is hard to believe that she’s just been with us since July 1st, so we thank her for being here and thank you all for tuning in. 
Indeed as Maggie mentioned, at one o’clock we will be talking about Archonnex, which is our new architecture for bringing in data and releasing it to you. Lots of good information there for those that are technical and those that are, like me, that are not so technical. So don’t be afraid and come visit us at one o’clock. And then we also have a two o’clock Orientation to ICPSR as well as some information on SEAD, which is again really important for researchers, a good gateway for them to produce data and move into that crowdsourcing arena to get easier metadata into us so it can stand the test of time. So with that, it looks like our questions are ended. And we will once again thank Maggie and wish you a good afternoon. Thank you.


About the Author: Oren Garnes
