Information Management

The goal of the PIE LTER data and information system is to provide a centralized network of information and data related to the PIE.

This network provides researchers access to common information and data in addition to protected long-term storage. Data and information are also easily accessible to local, regional, and state partners and the broader scientific community. Researchers associated with PIE are committed to the integrity of the information and databases resulting from the research.

To submit data for publication, e-mail the data and the completed Metadata Template to pie_im@mbl.edu, following the PIE LTER Data and Metadata Best Practices Guide. The PIE Information Manager will gladly assist with data archiving and publishing. Please reach out with any questions.

Research Project Design

Data management and the design of research projects are coordinated through the information management team. We encourage all students to meet with the information management team to discuss the design of their research project and the subsequent incorporation of data and information into the EDI database. For immediate assistance, please contact the Information Manager.

Data Submission

All PIE researchers are participants in the LTER Network and are required to submit research data and metadata in accordance with the LTER Data Access Policy. Researchers using PIE facilities are expected to comply with the LTER policy even if they are not funded by the LTER. We publish this data in the Environmental Data Initiative Repository at the time research results are peer-reviewed and published, or no later than 2 years after collection.

Data files should be submitted with the completed Metadata Template to the Information Manager (IM). The IM will review the data and metadata to ensure they meet PIE and LTER data standards and comply with the PIE LTER Data and Metadata Best Practices. Individual researchers are responsible for quality assurance, quality control, data entry, validation, and analysis for their respective projects.

If your dataset is an update to an existing dataset on EDI, please download the data package from EDI, make the necessary edits, and send it to the IM instead of creating a new data file and metadata file from scratch. Spatial data should be submitted as a zipped directory of raster or vector data and metadata. See here for best practices regarding spatial data. If you are using ArcGIS, we suggest using the built-in ArcCatalog to generate metadata.
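As an illustration of the zipped-directory format described above, here is a minimal Python sketch that packages a folder of spatial data and its metadata into a single archive. PIE does not prescribe a particular tool for this step; the function name, folder name, and file names below are hypothetical and stand in for your own.

```python
import shutil
from pathlib import Path


def zip_spatial_package(source_dir: str, output_stem: str) -> Path:
    """Zip a directory of spatial data and metadata into one archive.

    source_dir: folder holding the raster or vector files and the metadata.
    output_stem: archive path without the ".zip" extension.
    Both names are illustrative assumptions, not a PIE requirement.
    """
    source = Path(source_dir)
    if not source.is_dir():
        raise NotADirectoryError(f"{source} is not a directory")
    if not any(p.is_file() for p in source.rglob("*")):
        raise ValueError(f"{source} contains no files to archive")

    # make_archive appends ".zip" to output_stem and returns the archive path.
    # Passing base_dir keeps the top-level folder name inside the archive.
    archive = shutil.make_archive(
        output_stem,
        "zip",
        root_dir=source.parent,
        base_dir=source.name,
    )
    return Path(archive)
```

For example, calling zip_spatial_package("marsh_elevation", "marsh_elevation_package") would write marsh_elevation_package.zip next to the hypothetical marsh_elevation folder, ready to e-mail to the IM.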

Data Availability

PIE data is available for download from the Environmental Data Initiative (EDI) website and can also be browsed on our website at the PIE data catalog. PIE complies with the LTER Network Data Access Policy. We make every effort to release data in a timely fashion and with attention to accurate and complete metadata. PIE strives to make datasets easily accessible to PIE scientists, local, regional, and state partners, and the broader scientific community. Datasets span the full breadth of PIE research in the watersheds and estuary. All PIE data is licensed under CC BY 4.0.

Data Management Resources

Coding Club is a group of ecology and environmental science students and researchers from the University of Edinburgh who create tutorials and courses in coding, data science and statistics, with examples in R, Python and JavaScript. They offer a free, self-paced Data Science for Ecologists and Environmental Scientists course in which you can learn to use R to manipulate, graph and analyse ecological data, or build on your existing skills to create advanced data visualisations or master new analysis techniques such as mixed-effect modelling, ordination and more. Coding Club is for everyone, regardless of their career stage or current level of knowledge.

Fundamentals of Data Visualization by Claus O. Wilke is a guide to making visualizations that accurately reflect the data, tell a story, and look professional. The entire book is written in R Markdown, using RStudio as a text editor and the bookdown package to turn a collection of markdown documents into a coherent whole. The book’s source code is hosted on GitHub, at https://github.com/clauswilke/dataviz.

The DataONE Data Management Skillbuilding Hub is a repository for open educational resources regarding data management. DataONE also offers monthly webinars with discussions on open science, the role of the data lifecycle, and achieving innovative science through shared data and ground-breaking tools.

The NCEAS Learning Hub supports environmental scientists throughout their data science journey. They teach cutting-edge data science curriculum, facilitate collaborative learning, and promote best practices in open science. While their courses have a fee, many of their training materials are available for free.

The NEON Learning Hub offers educational resources that include online tutorials to gain the data skills needed to work with NEON data, teaching modules and materials for professors to use in graduate and undergraduate classrooms, and short, insightful videos about a wide variety of science topics.