How Scientists and Librarians Can Tackle the Reproducibility Crisis – Together

Letisha R. Wyatt, PhD

(Note: This is the second installment of three in a blog series titled “A Scientist in Librarian’s Clothing.” In part one, librarian Letisha R. Wyatt, Ph.D., discussed the challenging life of being an academic researcher.)

Prior to taking up the challenge of getting a doctorate, I worked at a clinical research organization (CRO) in San Francisco as a project/data manager for clinical trials. I assumed that my graduate training experiences would have the same level of control as seen in industry labs and clinical research.

In those environments, protocols were developed, deviations noted, standard operating procedures existed, and everything was documented. Also, steps were taken to secure and archive all the data for future use.

Alas, my experience didn’t reflect those expectations.

Sure, Good Laboratory Practice (GLP) guidelines existed and were often discussed in my graduate program within the pharmacy school. However, as academics, we were seemingly exempt from following these procedures.

Good Scientific Procedures DO Exist

Nevertheless, there are steps scientists can take to mitigate some reproducibility issues, and really, just make their lives easier (in the end)! So, my recommendations to grad students and early career scientists include:

  • Construct master plans for your data! Scientists often spend many months crafting the perfect research grant proposal. Then they get funded (hooray!) and jump right into data collection without considering the important details of how to be good data stewards, from project start to finish. Maybe you should consider the data management plan tool: It’s a simple online form that prompts you with data-related questions prior to starting a project.
  • Learn data management best practices … and figure out practical ways to implement them in your work. Much of what is taught in data management training may seem elementary or common sense. But, honestly, the challenge comes in the actual use of best practices. How can you standardize as much of your work as possible? A recent article, “Data Organization In Spreadsheets,” uses real-life examples to demonstrate best practices. These are usable even with simple tasks, such as spreadsheet data entry. “How To Share Data For Collaboration” is another very useful (and brief) manuscript with practicable ideas worth discussing in a lab meeting, or with your research group.
  • Build skills: learn basic programming in Python or R. I’m not saying that all bench scientists must become expert programmers or data scientists. But coding brings many benefits for reproducibility. The more of your work that is automated, the less manual interference there is for, say, cleaning up data in a spreadsheet, and the lower the chances of “contaminating” your work. Also, methods are more clearly articulated, particularly for data analysis. Of course, there is a bit of a learning curve here, but the more buy-in from multiple researchers within a group, the higher the likelihood of success. A great place to start is the Software and Data Carpentry site. The organization also offers workshops globally.

  • Visit your academic library. It’s not all about books and journals in there! During my doctoral training, the Norris Medical Library at the University of Southern California helped me in a multitude of ways. It provided resources and taught me how to find information efficiently (eventually unaided), as well as how to use a variety of applications. You shouldn’t overlook your library, as you’ll often find people working there with a diversity of skillsets.
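
To make the programming recommendation a little more concrete, here is a minimal Python sketch of what scripted spreadsheet clean-up might look like. It is only an illustration, not a recommended standard: the function names, the column names, and the list of "missing value" markers are all hypothetical, and it assumes your data has been exported as CSV text. The point is that the raw data is never edited by hand, and the script itself documents every cleaning step.

```python
import csv
import io

# Hypothetical markers that a lab might use for "no measurement".
MISSING_MARKERS = {"", "na", "n/a", "-", "none"}


def clean_rows(rows):
    """Return cleaned copies of the rows; the input rows are not modified."""
    cleaned = []
    for row in rows:
        new_row = {}
        for column, value in row.items():
            if column is None:
                # csv.DictReader stores surplus cells under the key None; skip them.
                continue
            # Standardize header names: trimmed, lowercase, underscores for spaces.
            name = column.strip().lower().replace(" ", "_")
            # Trim stray whitespace and use one consistent code for missing values.
            text = (value or "").strip()
            new_row[name] = "NA" if text.lower() in MISSING_MARKERS else text
        cleaned.append(new_row)
    return cleaned


def clean_csv_text(raw_text):
    """Clean CSV text and return the result as new CSV text."""
    reader = csv.DictReader(io.StringIO(raw_text))
    cleaned = clean_rows(reader)
    if not cleaned:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(cleaned[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(cleaned)
    return out.getvalue()


if __name__ == "__main__":
    # Made-up example data with messy headers, stray spaces, and a missing value.
    raw = "Sample ID, Weight g\n A1 ,12.5\nA2,n/a\n"
    print(clean_csv_text(raw))
```

Because the cleaning rules live in the script, anyone in the lab can rerun it on the original export and get the same result, which is much harder to guarantee with manual edits in a spreadsheet.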

(In part three, we’ll look at what librarians can do to help with reproducibility.)