@pipitone
Last active August 29, 2015 14:00
swcarpentry sorting hat
Timestamp Your Name: Email address: What is your career stage? What is your discipline? In three sentences or less, please describe your current field of work or your research question. What OS will you use on the laptop you bring to the workshop? With which programming languages, if any, could you write a program from scratch which imports some data and calculates mean and standard deviation of that data? What best describes how often you currently program? What best describes the complexity of your programming? (Choose all that apply.) A tab-delimited file has two columns showing the date and the highest temperature on that day. Write a program to produce a graph showing the average highest temperature for each month. How familiar are you with Git version control? Consider this task: given the URL for a project on GitHub, check out a working copy of that project, add a file called notes.txt, and commit the change. How familiar are you with unit testing and code coverage? Consider this task: given a 200-line function to test, write half a dozen tests using a unit testing framework and use code coverage to check that they exercise every line of the function. How familiar are you with SQL? Consider this task: a database has two tables: Scientist and Lab. Scientist's columns are the scientist's user ID, name, and email address; Lab's columns are lab IDs, lab names, and scientist IDs. Write an SQL statement that outputs the number of scientists in each lab. How familiar do you think you are with the command line? How would you solve this problem: A directory contains 1000 text files. Create a list of all files that contain the word "Drosophila" and save the result to a file called results.txt. How old are you? What is your gender? What is your ethnicity? (Choose all that apply) Please briefly describe the project you plan to work on for the last two days and what you would like to accomplish on it during that time. 
10/28/2013 16:30:39 person1 person1@example.com Support Staff Life science (ecology, zoology, botany), Economics, Tech support, lab tech, or support programmer Linux R, Python I program once a month. I write scripts to analyze data. I could complete the task with little or no documentation or search engine help. I am familiar with Git because I have used or am using it. I could complete the task with documentation or search engine help. I am familiar with unit testing or code coverage but have never used it. I am familiar with SQL because I have used or am using them. I could complete the task with little or no documentation or search engine help. I am familiar with the command line because I have used or am using it. I could not create this list. 35 - 44 Female Caucasian/White helping our supported projects make progress
10/28/2013 16:33:03 person2 person2@example.com Post-doc Life science (ecology, zoology, botany) Linux C++, Python, R, Ruby I program once a week or more. I write scripts to analyze data., I write tools to use and that others can use. I could complete the task with little or no documentation or search engine help. I am familiar with Git because I have used or am using it. I could complete the task with little or no documentation or search engine help. I am familiar with unit testing or code coverage because I have used or am using them. I could complete the task with documentation or search engine help. I am familiar with SQL because I have used or am using them. I could complete the task with documentation or search engine help. I am familiar with the command line because I have used or am using it. I would create this list using basic command line programs. 25 - 34 Male Caucasian/White phyloGenerator', a Python pipeline for making phylogenies from DNA data. The code is very old, and I would like to make it easier to read, and incorporate a few new features.
10/28/2013 16:35:42 person3 person3@example.com Post-master's research faculty Life science (ecology, zoology, botany) Windows R I program several times a year. I have never programmed., I write scripts to analyze data. I could complete the task with documentation or search engine help. I am familiar with Git but have never used it. I am not familiar with unit testing or code coverage. I am familiar with SQL because I have used or am using them. I could complete the task with documentation or search engine help. I am familiar with the command line but have never used it. 25 - 34 Female Caucasian/White I would like to create a way to dynamically interact with pdf documents, google earth (or other map images), and simple R functions to display various types of quantitative data.
10/28/2013 16:38:50 person4 person4@example.com Faculty Life science (ecology, zoology, botany) Apple OS X R, Python, Mathematica I program once a week or more. I write scripts to analyze data., I write tools to use and that others can use. I could complete the task with little or no documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am familiar with SQL because I have used or am using them. I could complete the task with documentation or search engine help. I am familiar with the command line because I have used or am using it. I would create this list using basic command line programs. Prefer not to say Prefer not to say Prefer not to say A model that combines human risk perception into a simple earth system model.
10/28/2013 16:43:58 person5 person5@example.com Post-doc Geography Windows R I program once a month. I write scripts to analyze data. I could complete the task with documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am familiar with SQL because I have used or am using them. I could complete the task with documentation or search engine help. I am familiar with the command line but have never used it. 25 - 34 Male Caucasian/White I plan to work on a continental scale analysis of forest land use changes. This could include geographic data processing (Python) and analysis (r).I hope to become a more competent programmer in both.
10/28/2013 16:54:30 person6 person6@example.com Graduate Earth sciences (geology, oceanography, meteorology), Life science (ecology, zoology, botany) Windows R, C, IDL I program once a week or more. I write scripts to analyze data., I write tools to use and that others can use. I could complete the task with little or no documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am familiar with SQL because I have used or am using them. I could complete the task with documentation or search engine help. I am familiar with the command line because I have used or am using it. I would create this list using basic command line programs. 25 - 34 Male Asian/Asian American I would like to see a more professional way of doing programming since most of time I am writing code for my own study.
10/28/2013 17:04:21 person7 person7@example.com Post-doc Life science (ecology, zoology, botany) Windows R I program several times a year. I write scripts to analyze data. I could complete the task with little or no documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am not familiar with SQL. I am familiar with the command line because I have used or am using it. I would create this list using "Find in Files" and "copy and paste." 35 - 44 Male Latino(a) Lear how to use repast. A java library for using Agents based models
10/28/2013 18:18:42 person8 person8@example.com Faculty Life science (ecology, zoology, botany), Humanities and social sciences Windows Matlab, R I program once a month. I write scripts to analyze data., I write tools to use and that others can use., I am part of a team which develops software. I could complete the task with little or no documentation or search engine help. I am familiar with Git but have never used it. I am not familiar with unit testing or code coverage. I am familiar with SQL because I have used or am using them. I could complete the task with documentation or search engine help. I am familiar with the command line because I have used or am using it. I would create this list using a pipeline of command line programs. 25 - 34 Male Caucasian/White Parallelizing existing code for a simulation model written in Matlab.
10/28/2013 18:22:51 person9 person9@example.com Graduate Life science (ecology, zoology, botany) Apple OS X R, Python (rusty though) I program once a week or more. I write scripts to analyze data., I write tools to use and that others can use. I could complete the task with little or no documentation or search engine help. I am familiar with Git but have never used it. I am not familiar with unit testing or code coverage. I am familiar with SQL but have never used it. I am familiar with the command line but have never used it. 25 - 34 Male Caucasian/White I will be joining the SESYNC MPA Performance Working Group in November and will try to do work for this group at the Software Carpentry workshop. I'm not entirely sure what this will entail... Alternative ideas include: creation of a forest cover layer from aerial imagery, development of tool to query an online database and import it into my own database, learn to build databases that organize PDF files along with other data, put my database online and develop interface for exploring it.
10/28/2013 19:20:55 person10 person10@example.com Post-doc Earth sciences (geology, oceanography, meteorology), Life science (ecology, zoology, botany), Life science (biology, genetics) Windows R, maybe python (new to it) I program once a week or more. I write scripts to analyze data., I write tools to use and that others can use. I could complete the task with documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am familiar with SQL but have never used it. I am familiar only with the term "command line". 35 - 44 Female Caucasian/White several possible; would like to improve a loop to modify satellite-derived global crop area extent at the .05x.05 degree scale, based on bottom-up reporting of harvested areas.
10/28/2013 20:13:56 person11 person11@example.com Post-doc Life science (ecology, zoology, botany) Windows R, WinBUGS I program less than one a year. I write scripts to analyze data. I could complete the task with documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am familiar only with the name SQL. I am familiar with the command line because I have used or am using it. I would create this list using basic command line programs. 35 - 44 Male Caucasian/White In mid-November I will attend a SESYNC conference focused on developing a framework to guide water resources management decision-making that accounts for climate change (uncertainty) and the ecosystem requirements. In the last two days of the workshop, I hope to apply the framework we define in the workshop with real world examples, probably from well-studied systems where data are available.
10/29/2013 7:44:47 person12 person12@example.com Faculty Life science (ecology, zoology, botany) Apple OS X R I program once a week or more. I write scripts to analyze data. I could complete the task with little or no documentation or search engine help. I am familiar with Git but have never used it. I am not familiar with unit testing or code coverage. I am familiar with SQL but have never used it. I am familiar with the command line because I have used or am using it. I would create this list using "Find in Files" and "copy and paste." 35 - 44 Female Caucasian/White I am not sure yet. It may be a project related to my Sesync working group or it may relate to an upcoming manuscript exploring biogeographic distribution of plants with different growth forms. In either case, it would be using R and incorporating phylogenetic analyses, especially logistic PGLS.
10/29/2013 9:26:20 person13 person13@example.com Faculty Earth sciences (geology, oceanography, meteorology) Apple OS X Matlab, Python I program several times a year. I write scripts to analyze data., I write tools to use and that others can use. I could complete the task with little or no documentation or search engine help. I am familiar with Git but have never used it. I am not familiar with unit testing or code coverage. I am familiar with SQL because I have used or am using them. I could complete the task with documentation or search engine help. I am familiar with the command line because I have used or am using it. I would create this list using basic command line programs. 35 - 44 Male Caucasian/White Analyzing data collected from disparate instruments, including using regular expressions to merge data files then filter, sort, and clean the data. Hoping to be come more comfortable in R and Python (I use Matlab primarily now). I would also like to learn visualization techniques in R and Python.
10/29/2013 10:38:55 person14 person14@example.com Graduate Life science (ecology, zoology, botany) Apple OS X R I program once a week or more. I write scripts to analyze data. I could complete the task with little or no documentation or search engine help. I am familiar with Git but have never used it. I am familiar only with the terms "unit testing" and "code coverage". I am familiar only with the name SQL. I am familiar with the command line because I have used or am using it. I would create this list using "Find in Files" and "copy and paste." 25 - 34 Male Caucasian/White An analysis of spatial point and associated trait data for trees in a Smithsonian Center for Tropical Forest Science 50 hectare plot. Would compare observed patterns with stochastic and various deterministic models. Am also interested in compiling trait data from published floras for the Malay Archipelago. Would be interested in creating a data gathering pipeline and online database.
10/29/2013 11:52:17 person15 person15@example.com Graduate Life science (ecology, zoology, botany) Windows R, Mathmatica, Matlab I program once a week or more. I write scripts to analyze data. I could complete the task with little or no documentation or search engine help. I am not familiar with Git. I am familiar only with the terms "unit testing" and "code coverage". I am familiar with SQL but have never used it. I am familiar with the command line because I have used or am using it. I would create this list using basic command line programs. 25 - 34 Male Caucasian/White I hope to improve my hierarchical modeling programs so that they are more streamlined and efficient. I would like to learn how to ensure that the code is always functioning properly, especially with large spreadsheet manipulation.
10/29/2013 12:20:55 person16 person16@example.com Data curation specialist Open data outreach and education; support Apple OS X R, Matlab I program less than one a year. I write scripts to analyze data., I am part of a team which develops software. I could complete the task with documentation or search engine help. I am familiar with Git but have never used it. I am not familiar with unit testing or code coverage. I am not familiar with SQL. I am familiar with the command line because I have used or am using it. I would create this list using "Find in Files" and "copy and paste." 25 - 34 Female Caucasian/White My big goal is to better understand how to help facilitate better research practices, especially as it relates to scientific programming. I am not sure what project I will focus on - if the timing works out, I have some survey data that I would like to analyze using scripts rather than old-fashioned excel. I would really like to be more familiar with Git since I'm working on a few projects here at CDL (as a project manager, NOT a programmer) that might benefit from Github. I'm also very interested in understanding SQL so I can better discuss it with researchers.
10/29/2013 17:08:06 person17 person17@example.com Graduate Life science (ecology, zoology, botany) Linux R I program once a week or more. I write scripts to analyze data. I could complete the task with documentation or search engine help. I am familiar with Git because I have used or am using it. I could not complete this task. I am not familiar with unit testing or code coverage. I am familiar only with the name SQL. I am familiar with the command line because I have used or am using it. I would create this list using basic command line programs. 25 - 34 Male Caucasian/White I am currently composing a project for an econometrics course in which I am regressing ecological footprint per capita on an array of economic and demographic variables, using World Bank data. I am hoping to use this as a platform for learning how to set up and execute a fluid workflow model, from database compilation to scripting and versioning to data analysis and automatic production of reports (using knitr, Sweave, RMarkdown, or something along these lines) that allow for reproducible research. (I could also use some guidance in strategies for using the shell to improve my workflow, and to dependably back up my work.)
10/30/2013 8:30:44 person18 person18@example.com Graduate Physics Linux C++, C I program once a week or more. I am part of a team which develops software. I could complete the task with documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am not familiar with SQL. I am familiar with the command line because I have used or am using it. I would create this list using a pipeline of command line programs. 25 - 34 Male Caucasian/White I would like to work on a project involving python and/or SQL. I hope that the project I work on will help to give me basic competency in these areas.
10/30/2013 18:00:10 person19 person19@example.com Post-doc Life science (ecology, zoology, botany) Windows R, Matlab I program once a week or more. I write scripts to analyze data., I write tools to use and that others can use. I could complete the task with little or no documentation or search engine help. I am familiar with Git but have never used it. I am familiar only with the terms "unit testing" and "code coverage". I am familiar only with the name SQL. I am familiar with the command line because I have used or am using it. I would create this list using basic command line programs. 35 - 44 Male Caucasian/White I haven’t decided for sure yet. Likely one of two projects: (1) implementing a maximum likelihood or hierarchical Bayesian model of mayfly population dynamics under different predation regimes, or (2) update and convert a set of functions that I have already written in R, into an “official” contributed R package via CRAN – or at least learn more about how to do this (I have no idea how to do this, but it seems that this workshop might be a good place to get some good help on this).
11/1/2013 10:33:28 person20 person20@example.com Post-doc Life science (biology, genetics), Brain and neurosciences Windows Matlab, R, Python I program once a week or more. I write scripts to analyze data. I could complete the task with little or no documentation or search engine help. I am familiar with Git but have never used it. I am familiar only with the terms "unit testing" and "code coverage". I am familiar with SQL but have never used it. I am familiar with the command line because I have used or am using it. I would create this list using basic command line programs. 25 - 34 Male Caucasian/White Unfortunately, I will only be able to attend the first two days of the workshop.
11/1/2013 14:07:01 person21 person21@example.com Faculty Economics Windows Command languages only (e.g., SAS) I program less than one a year. I write scripts to analyze data., I am part of a team which develops software. I could complete the task with documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am familiar with SQL because I have used or am using them. I could complete the task with documentation or search engine help. I am familiar with the command line because I have used or am using it. I would create this list using "Find in Files" and "copy and paste." 45 - 54 Female Caucasian/White I will be working with Anna McMurray on these goals: Develop a computer program that would allow a user to extract data from public twitter accounts, using keywords contained within tweets and profile information such as location and socio-demographic information. The goal is to explore patterns in the use and value of selected ecosystem services. As one example, we wish to explore information on people’s outdoor recreational preferences (where they go, how far they travel, what activities people engage in at a given natural area, etc.). If this is not feasible, we will be working with a time-series, cross-sectional database of the Potomac River. The database is large because it covers a long time period and a wide range of variables. Because data were extracted from multiple sources, some data cleaning is needed to create a consistent dataset in which time periods are aligned across variables. The goal with the analysis is to statistically test hypotheses related to relationships between stressors and key ecosystem responses. Within the workshop, we aim to apply simple regression models and multi-variate techniques to explore the data structure - prior to final statistical analysis.
11/1/2013 14:12:54 person22 person22@example.com Faculty Research Assistant Economics Windows None I have never programmed. I have never programmed. I could not complete this task. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am familiar with SQL because I have used or am using them. I could complete the task with documentation or search engine help. I am familiar with the command line because I have used or am using it. I would create this list using "Find in Files" and "copy and paste." 25 - 34 Female Caucasian/White I would like to develop a computer program that would allow me to extract data from public twitter accounts, using keywords contained within tweets and profile information such as location and socio-demographic information. The goal is to explore patterns in the use and value of selected ecosystem services. As one example, we wish to explore information on people’s outdoor recreational preferences (where they go, how far they travel, what activities people engage in at a given natural area, etc.). If this is not feasible, I will be working with a time-series, cross-sectional database of the Potomac River. The database is large because it covers a long time period and a wide range of variables. Because data were extracted from multiple sources, some data cleaning is needed to create a consistent dataset in which time periods are aligned across variables. The goal with the analysis is to statistically test hypotheses related to relationships between stressors and key ecosystem responses. Within the workshop, we aim to apply simple regression models and multi-variate techniques to explore the data structure - prior to final statistical analysis.
11/1/2013 14:24:57 person23 person23@example.com Post-doc Humanities and social sciences Apple OS X R I program several times a year. I write scripts to analyze data. I could complete the task with documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am familiar with SQL because I have used or am using them. I could complete the task with documentation or search engine help. I am not familiar with the command line. 25 - 34 Male Caucasian/White I would like to work with the USAID DHS survey database that SESYNC has prepared for one of its funded projects, which I am participating in. We are investigating the links between conservation strategies, environmental conditions, and human health/well-being. I would like to apply some of the skills I learn during the workshop to ensure effective collaboration with my colleagues on the project as we work across R, SQL, and ArcGIS.
11/1/2013 14:30:58 person24 person24@example.com Faculty Life science (ecology, zoology, botany), Life science (biology, genetics) Apple OS X R I program once a month. I write scripts to analyze data. I could complete the task with little or no documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am familiar with SQL but have never used it. I am familiar with the command line because I have used or am using it. I would create this list using basic command line programs. 45 - 54 Female Caucasian/White Both of these projects are candidates.
11/1/2013 14:31:10 person25 person25@example.com non-profit Life science (ecology, zoology, botany), Humanities and social sciences Windows none I have never programmed. I have never programmed. I could not complete this task. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am not familiar with SQL. I am familiar only with the term "command line". 25 - 34 Female Caucasian/White I think that my partner, Julie Ekstrom, will ultimately be deciding what component of the project that we work on during the last few days. She and I have not discussed this yet.
11/1/2013 14:41:42 person26 person26@example.com Post-doc Life science (ecology, zoology, botany) Windows Probably R, Python, but not just off the top of my head I program several times a year. I write scripts to analyze data., I write tools to use and that others can use. I could complete the task with little or no documentation or search engine help. I am familiar with Git but have never used it. I am familiar only with the terms "unit testing" and "code coverage". I am familiar with SQL because I have used or am using them. I could complete the task with little or no documentation or search engine help. I am familiar with the command line because I have used or am using it. I would create this list using "Find in Files" and "copy and paste." 25 - 34 Female Caucasian/White No definite plan, but I would be interested in optimizing a function written in R to improve performance and/or writing code that automates tasks in R including dynamically accessing data files and writing output files or manipulating output format - possibly incorporating different file types & visualizing data.
11/1/2013 14:43:45 person27 person27@example.com Faculty Life science (biology, genetics) Windows Python, R I program several times a year. I write scripts to analyze data., I write tools to use and that others can use. I could complete the task with documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am familiar with SQL because I have used or am using them. I could not complete this task. I am familiar with the command line because I have used or am using it. I would create this list using a pipeline of command line programs. 25 - 34 Male Caucasian/White I would like to improve my basic skills in scripting. I often use other script based programs on the command line, but am left with files that need to manipulated, cleaned up, concatenated, etc. I am thinking that I would like to work on downstream analysis of a genomic dataset, writing efficient scripts for calculating relevant pop. gen (or other )statistics, using re-sampling methods, bootstrapping, etc. Integration into graphics programs like R would be ideal.
11/1/2013 14:54:52 person28 person28@example.com Graduate Earth sciences (geology, oceanography, meteorology), Life science (ecology, zoology, botany) Windows MATLAB I program several times a year. I write scripts to analyze data. I could complete the task with little or no documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am not familiar with SQL. I am not familiar with the command line. 25 - 34 Female Caucasian/White I plan to bring stage (depth) data that is logged automatically in each of my nine stream sites. My plan is to learn to improve the existing code I have written in MATLAB to process these data (coarsen the data via linear interpolation and otherwise clean up the data), or learn enough R code to replicate what I have done in the R program.
11/1/2013 15:37:05 person29 person29@example.com Graduate Physics Apple OS X C, C++ a little Python I program once a week or more. I write scripts to analyze data. I could complete the task with documentation or search engine help. I am familiar with Git because I have used or am using it. I could complete the task with little or no documentation or search engine help. I am familiar only with the terms "unit testing" and "code coverage". I am familiar with SQL but have never used it. I am familiar with the command line because I have used or am using it. I would create this list using a pipeline of command line programs. Prefer not to say Prefer not to say Prefer not to say I use scripting and C++ a lot in my day to day. So I would say I have a lot of experience coding. But I have never taken a formal class on them, and every now and then something pops up, and I'll be like wow, how did I not know that. So I'm hoping this workshop will quickly cover all the fundamentals, and then get into some of the more advanced tricks. I also have a little python experience, more would be even better.
11/3/2013 20:53:36 person30 person30@example.com Post-doc Life science (ecology, zoology, botany) Apple OS X R, Matlab I program once a month. I write scripts to analyze data. I could complete the task with little or no documentation or search engine help. I am familiar with Git but have never used it. I am familiar only with the terms "unit testing" and "code coverage". I am familiar with SQL but have never used it. I am familiar with the command line because I have used or am using it. I would create this list using "Find in Files" and "copy and paste." 25 - 34 Female Caucasian/White Work on individual-based, spatially explicit simulation model describing species distributions and functional traits. If possible, work on how to import and work with large databases (perhaps BCI tree data set) in R by doing analyses similar to that of the simulation model.
11/4/2013 6:52:58 person31 person31@example.com Post-doc Life science (ecology, zoology, botany) Windows none - maybe R I program less than one a year. I have never programmed. I could complete the task with documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am familiar with SQL but have never used it. I am not familiar with the command line. 25 - 34 Female mixed Three different potential projects (depending on state of data in hand an what is the feasible). (1) Using statistical matching methods to determine match sites in MPA's to control sites for MPA's. (2) Looking at a a suite of variables to determine what variable are driving fish and benthic populations (3) working on combing databases together, writing queries, building QAQC for databases
11/4/2013 8:57:12 person32 person32@example.com Post-doc Life science (ecology, zoology, botany) Windows R I program several times a year. I write scripts to analyze data. I could not complete this task. I am familiar with Git but have never used it. I am not familiar with unit testing or code coverage. I am familiar with SQL but have never used it. I am familiar with the command line because I have used or am using it. I could not create this list. 45 - 54 Female Caucasian/White Hmmm.... I'll have to think about that.
11/4/2013 11:17:02 person33 person33@example.com Faculty Research Assistant Humanities and social sciences Windows none I have never programmed. I have never programmed. I could not complete this task. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am not familiar with SQL. I am not familiar with the command line. 25 - 34 Female Caucasian/White I plan to work on the demographic assessment of SESYNC participants and to utilize the skills learned from this workshop to assist with this.
11/6/2013 9:50:09 person34 person34@example.com Support Staff Tech support, lab tech, or support programmer Apple OS X C I program less than one a year. I write scripts to analyze data. I could complete the task with documentation or search engine help. I am familiar with Git but have never used it. I am familiar only with the terms "unit testing" and "code coverage". I am familiar with SQL because I have used or am using them. I could complete the task with documentation or search engine help. I am familiar with the command line because I have used or am using it. I would create this list using a pipeline of command line programs. 25 - 34 Male Caucasian/White No Set project, taking class to gain knowledge in tools used by our users and our visitors.
11/6/2013 9:59:03 person35 person35@example.com Faculty Earth sciences (geology, oceanography, meteorology), Geographical Sciences Windows R, IDL (some experience with C, C++, Python) I program several times a year. I write scripts to analyze data., I write tools to use and that others can use. I could complete the task with little or no documentation or search engine help. I am familiar only with the name Git. I am familiar with unit testing or code coverage but have never used it. I am familiar with SQL because I have used or am using them. I could complete the task with documentation or search engine help. I am familiar with the command line because I have used or am using it. I would create this list using a pipeline of command line programs. 25 - 34 Female Asian/Asian American I would like to make an animated plot in R or Python to display time series data and maybe learn how to make a GUI for an existing program. (or something on these lines)
11/6/2013 10:30:50 person36 person36@example.com Graduate Life science (ecology, zoology, botany) Apple OS X R I program once a week or more. I write scripts to analyze data. I could complete the task with little or no documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am not familiar with SQL. I am familiar with the command line but have never used it. 25 - 34 Male Caucasian/White I would like to practice and become comfortable with integrating the use of the shell and python with R.
11/6/2013 16:52:36 person37 person37@example.com Graduate Earth sciences (geology, oceanography, meteorology) Apple OS X R, IDL I program once a week or more. I write scripts to analyze data., I write tools to use and that others can use. I could complete the task with little or no documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am familiar with SQL but have never used it. I am familiar with the command line because I have used or am using it. I would create this list using basic command line programs. 25 - 34 Female Caucasian/White I've written an algorithm to automatically extract individual tree crowns from lidar datasets - it would be good to implement some version control on this algorithm or maybe make some parts of it more efficient by implementing some freeware.
11/7/2013 15:23:15 person38 person38@example.com Post-doc Life science (ecology, zoology, botany) Windows none I program less than one a year. I have never programmed. I could not complete this task. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am familiar only with the name SQL. I am familiar with the command line but have never used it. 45 - 54 Male Caucasian/White, Latino(a) I want to use cross-national survey data from multiple years (e.g. Albania 1992, 1999; Kenya 1992) on household-level income generating activities ("livelihoods") to test hypotheses about how the variety, distinctiveness, and relative proportion of livelihoods changes with respect to socio-economic development. During those two days, I would love to learn how to load relevant files or parts of files into R and to do some preliminary analyses (does the relative proportion of activities rise with indices of development? Does the diversity differ between national and sub-national scales?
11/7/2013 15:58:17 person39 person39@example.com fellow (after postdoc) Life science (ecology, zoology, botany), Humanities and social sciences Windows matlab I program less than one a year. I have never programmed., I write scripts to analyze data. I could complete the task with documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am familiar with SQL but have never used it. I am familiar with the command line but have never used it. 35 - 44 Female Caucasian/White How to associate data to coastal segments and generate several scenarios of different indices (to do a sensitivity test to see how different the index options are)
11/8/2013 13:13:27 person40 person40@example.com Graduate Geography Windows Python, C++, C, Java, etc. I program once a week or more. I am part of a team which develops software. I could complete the task with documentation or search engine help. I am familiar with Git but have never used it. I am familiar with unit testing or code coverage because I have used or am using them. I could complete the task with documentation or search engine help. I am familiar with SQL because I have used or am using them. I could complete the task with documentation or search engine help. I am familiar with the command line because I have used or am using it. I would create this list using a pipeline of command line programs. 18 - 24 Female Caucasian/White, Latino(a) Still thinking about it.
11/9/2013 23:23:04 person41 person41@example.com Graduate Earth sciences (geology, oceanography, meteorology), Life science (ecology, zoology, botany) Windows R, C++, Matlab, SAS I program once a week or more. I write scripts to analyze data. I could complete the task with little or no documentation or search engine help. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am familiar with SQL but have never used it. I am familiar with the command line because I have used or am using it. I would create this list using basic command line programs. 25 - 34 Male Caucasian/White I have a stream bacterial genetics dataset produced using Illumina MiSeq sequencing. The processing of this data is conducted in Python. I would like to have at least an understanding of how to properly execute the code I'll be working with.
11/15/2013 13:47:46 person42 person42@example.com Graduate Physics Windows R, Python, C++, Matlab I program once a week or more. I write scripts to analyze data., I write tools to use and that others can use. I could complete the task with little or no documentation or search engine help. I am familiar with Git but have never used it. I am familiar only with the terms "unit testing" and "code coverage". I am familiar with SQL but have never used it. I am familiar with the command line because I have used or am using it. I would create this list using basic command line programs. 18 - 24 Male Caucasian/White If I could get a good system for version control up and running that would be great. We've also implemented a tiny bit of automated testing but it's severely lacking, and I'd like to improve that. Finally, if I could improve the software architecture (or at least work on a plan and roadmap to improve our software architecture) that'd be great too.
person43 person43@example.com support staff
11/23/2013 12:27:39 person44 person44@example.com Graduate Life science (ecology, zoology, botany) Windows none I have never programmed. I have never programmed. I could not complete this task. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am familiar only with the name SQL. I am familiar with the command line but have never used it. 25 - 34 Female Latino(a) The specifics is still to be determine but it will involve using R and or other remote sensing programs to extract wavelength data and vegetation.
11/25/2013 11:15:04 person45 person45@example.com Graduate Life science (ecology, zoology, botany) Windows R I program several times a year. I write scripts to analyze data. I could complete the task with documentation or search engine help. I am familiar with Git but have never used it. I am familiar only with the terms "unit testing" and "code coverage". I am familiar only with the name SQL. I am familiar with the command line because I have used or am using it. 35 - 44 Male Caucasian/White My work involves a series of spatial analyses both within and among thousands of headwater systems across a large region. Considering the incredible number of processes involved, I am most interested in learning to program for task automation.
11/29/2013 10:53:04 person46 person46@example.com Graduate Earth sciences (geology, oceanography, meteorology), Life science (ecology, zoology, botany) Windows none I have never programmed. I have never programmed. I could not complete this task. I am not familiar with Git. I am not familiar with unit testing or code coverage. I am not familiar with SQL. I am familiar only with the term "command line". 25 - 34 Male Caucasian/White The LIDAR data I will be using for my research is originally stored in ASCII format in many separate text files. I will be using a program called LIDAR FUSION to combine these files, then convert them to a format to create a continuous digital elevation model for Maryland. I would like to build script functions that will allow me to efficiently combine and convert these text files to the appropriate format to run FUSION functions.
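To make the scoring concrete, here is a minimal sketch of how a single response becomes a skill score, assuming the point values used by the sortinghat.py script in this gist. The `POINTS` table is a trimmed-down, hypothetical copy that holds only the answers needed for this example, and `score_response` is a hypothetical helper, not part of the script. Unlike the script, this sketch scores unknown answers as 0 instead of raising an error. The sample answers are person30's.

```python
# Hypothetical, trimmed-down point table:
# zero-based column index -> answer text -> points.
POINTS = {
    8: {"I program once a month.": 3},
    10: {"I could complete the task with little or no documentation or search engine help.": 2},
    11: {"I am familiar with Git but have never used it.": 2},
    13: {'I am familiar only with the terms "unit testing" and "code coverage".': 1},
    17: {"I am familiar with the command line because I have used or am using it.": 3},
}


def score_response(answers_by_column):
    """Sum the points for every scored column.

    Assumption for this sketch: blank or unknown answers count as 0.
    """
    total = 0
    for column, table in POINTS.items():
        # Treat non-breaking spaces like ordinary spaces before the lookup.
        answer = answers_by_column.get(column, "").replace("\xa0", " ").strip()
        total += table.get(answer, 0)
    return total


# person30's answers to the five scored questions.
person30 = {
    8: "I program once a month.",
    10: "I could complete the task with little or no documentation or search engine help.",
    11: "I am familiar with Git but have never used it.",
    13: 'I am familiar only with the terms "unit testing" and "code coverage".',
    17: "I am familiar with the command line because I have used or am using it.",
}

print(score_response(person30))  # 3 + 2 + 2 + 1 + 3 = 11
```

The SQL question (column 15) is not in the table because the script does not score it.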
"""
Usage: sortinghat.py [num_rooms] < survey_results.csv

num_rooms - number of rooms to divide participants up into
"""
import csv
import sys

# Points awarded per answer, keyed by zero-based CSV column index.
# A blank answer scores 0.
answers = {
    # How often the respondent programs.
    8: {
        "": 0,
        "I have never programmed.": 0,
        "I program less than one a year.": 1,  # sic: must match the survey text
        "I program several times a year.": 2,
        "I program once a month.": 3,
        "I program once a week or more.": 4,
    },
    # Temperature-graph programming task.
    10: {
        "": 0,
        "I could not complete this task.": 0,
        "I could complete the task with documentation or search engine help.": 1,
        "I could complete the task with little or no documentation or search engine help.": 2,
    },
    # Familiarity with Git.
    11: {
        "": 0,
        "I am not familiar with Git.": 0,
        "I am familiar only with the name Git.": 1,
        "I am familiar with Git but have never used it.": 2,
        "I am familiar with Git because I have used or am using it.": 3,
    },
    # Familiarity with unit testing and code coverage.
    13: {
        "": 0,
        "I am not familiar with unit testing or code coverage.": 0,
        'I am familiar only with the terms "unit testing" and "code coverage".': 1,
        "I am familiar with unit testing or code coverage but have never used it.": 2,
        "I am familiar with unit testing or code coverage because I have used or am using them.": 3,
    },
    # Familiarity with the command line.
    17: {
        "": 0,
        "I am not familiar with the command line.": 0,
        'I am familiar only with the term "command line".': 1,
        "I am familiar with the command line but have never used it.": 2,
        "I am familiar with the command line because I have used or am using it.": 3,
    },
}


def clean(text):
    """Normalise an answer; the survey export contains non-breaking spaces."""
    return text.replace("\xa0", " ").strip()


reader = csv.reader(sys.stdin)
row = col = None  # kept in scope so the error handlers can report them
try:
    people = []
    for j, row in enumerate(reader):
        if j == 0:
            continue  # skip the header row
        total = 0
        for col in answers:
            total += answers[col][clean(row[col])]
        people.append({"score": total, "name": row[1], "email": row[2]})

    # Lowest scores first, so each room holds people of similar experience.
    people.sort(key=lambda p: p["score"])

    if len(sys.argv) > 1:
        rooms = int(sys.argv[1])
        # Guard against a zero step when there are more rooms than people.
        room_size = max(1, len(people) // rooms)
        # Any remainder forms an extra, smaller room at the end.
        for number, start in enumerate(range(0, len(people), room_size), 1):
            members = people[start:start + room_size]
            print("Room {0}: scores {1[score]}-{2[score]}".format(number, members[0], members[-1]))
            print(", ".join(m["email"] for m in members))
            print()
    else:
        for person in people:
            print("{score} {name} {email}".format(**person))
except IndexError:
    # The row has fewer columns than expected.
    print("INDEX", row, "::", col, file=sys.stderr)
except KeyError as e:
    # The answer text is not in the scoring table for this column.
    print("KEY", e, row, "::", list(answers[col].keys()), file=sys.stderr)
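The room split above uses a fixed room size of `len(people) // rooms`, so when the head count is not a multiple of the number of rooms, the leftover people land in an extra, smaller room. One possible alternative, sketched here as an assumption and not as the gist's method, spreads the remainder so that exactly `rooms` groups come out and their sizes differ by at most one. `split_evenly` is a hypothetical helper, and the scores in the usage example are made up.

```python
def split_evenly(people, rooms):
    """Split an already-sorted list into `rooms` contiguous groups.

    Group sizes differ by at most one; the larger groups come first.
    """
    if rooms < 1:
        raise ValueError("rooms must be at least 1")
    base, extra = divmod(len(people), rooms)
    groups = []
    start = 0
    for index in range(rooms):
        # The first `extra` groups each take one additional person.
        size = base + (1 if index < extra else 0)
        groups.append(people[start:start + size])
        start += size
    return groups


# Made-up, already sorted scores for ten participants.
scores = [0, 1, 2, 4, 5, 7, 9, 11, 12, 13]
for number, group in enumerate(split_evenly(scores, 3), 1):
    print("Room {0}: scores {1}-{2}".format(number, group[0], group[-1]))
```

With more rooms than people, some groups come back empty, so a caller would need to skip those before printing a score range.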