June Line-up of CaRCC People Network and RCD Community Calls

Greetings Community!

Please mark your calendars for our upcoming People Network Track calls for the month of June, with Zoom details below. For handy calendar entries, please try the CaRCC Events Calendar.

We’d also like to highlight a number of calls from our RCD Ecosystem partners and collaborators, as these events touch many, if not all, in our community.

Data-Facing Track (first Tuesdays)

Tuesday, June 1st, 1p ET/ 12p CT/ 11a MT/ 10a PT

Open Science Framework Integrations discussion facilitated by Eric Olson, Center for Open Science.

Eric Olson of the Center for Open Science will discuss the Open Science Framework and new integrations with the framework. The OSF is a free, open platform to support your research and enables collaboration, data management and sharing. The goal is to meet researchers where they work. Third party integrations enable researchers to work in other platforms such as Google Drive, Box, or Dropbox, and connect all the project related content under one resource. New integrations for tools such as OneDrive for Business will be released soon.

Please note that the HathiTrust talk has been rescheduled for July.

Researcher-Facing Track (second Thursdays)

We will pause for the month of June for the 2021 Virtual Residency Summer Workshop for Research Computing Facilitation, June 7 – 11. Sessions cover introductory to advanced topics, one need not attend all sessions, and registration is free. We encourage all to attend.

Emerging Centers Track (third Wednesdays)

Wednesday, June 16th, 12pm ET/ 11am CT/ 10am MT/ 9am PT/ 7am HT

Speak Above the Noise, discussion facilitated by Jane Combs and Rich Knepper

In an age of overcommunication across many channels, research service providers, and Research Computing Centers in particular, share a common need: reaching their audience and getting the word out that their services exist. At the June EC call, we will hold an interactive brainstorming session to identify the best communication channels, the best delivery methods (slides, handouts, webinars…), and the key items to include in materials that give faculty, students, and administrators their first introduction to the research computing resources available at our institutions. The outcome of this session will provide input to the PEARC ’21 workshop, Refining Your Research Computing Pitch.

Systems-Facing Track (third Thursdays)

Thursday, June 17th, 11a ET/ 10a CT/ 9a MT/ 8a PT  << NOTE TIME CHANGE FOR JUNE!!

Two unique approaches to developing CUI Compliant Systems, discussion facilitated by Carolyn Ellis, Purdue University; and Erik Deumens, University of Florida

Carolyn Ellis of Purdue University and Erik Deumens of the University of Florida will present their unique approaches to the challenges of supporting controlled unclassified information (CUI) in their respective academic settings.

And please also note these additional community opportunities (in no particular order):

June 7th – 11th: The 2021 Virtual Residency Introductory/Intermediate/Advanced Workshop. We highly recommend this free, excellent, in-person and virtual conference for anyone interested in RCD facilitation skills. Many in the community who attend and participate in Researcher-Facing calls are also presenters and contributors. There’s NO PREREQUISITE other than an interest in helping researchers with their computing-intensive/data-intensive research. Registration is now open.

June 24th, 1p ET / 12p CT / 11a MT / 10a PT, EDUCAUSE RCD June Open Call. For more information on the RCD group and joining this and future calls, please see the Research Computing and Data Community Group web page.

June 22nd, 11a PT/ 12p MT/ 1p CT/ 2p ET (4th Tue each month) & 2nd Wed 8a PT/ 9a MT/ 10a CT/ 11a ET: Office Hours for The Research Computing and Data Capabilities Model. Have questions about how to get started? Or are you already working with it and just want to discuss the process, or a particular aspect of the assessment tool? Join us to get help, ask your questions, and share your experiences!

June 10th, 12p: US Research Software Engineers Monthly Community Call. Please see the Get Involved page for more information.

General Track Call Information

Interested members of the People Network need not subscribe to a particular track to participate in calls. Additional details for track members, including notes documents and any pre-call activities, will be distributed ahead of the call via the email lists and other communication channels within each track.

The CaRCC (Campus Research Computing Consortium) People Network aims “to foster, build and grow an inclusive community (termed the ‘People Network’) for campus CI, research computing and data professionals.” If you would like to join the People Network, which includes Researcher-facing, Data-facing, Systems-facing, and other tracks, please fill in the form at http://bit.ly/join_carcc_people_network.

All calls will take place within the same Zoom room distributed via email. Please join the People Network (link just above) or contact help@carcc.org for details.

Announcing the V1.1 Release of the RCD Capabilities Model

Internet2, CaRCC, and EDUCAUSE are pleased to announce that there is a new version of the Research Computing and Data Capabilities Model (RCD CM) just in time for you to complete your Institutional assessment and contribute to the 2021 RCD CM Community Dataset!

The Research Computing and Data Capabilities Model allows institutions to assess their support for computationally- and data-intensive research, to identify potential areas for improvement, and to understand how the broader community views Research Computing and Data support.

What’s new in the v1.1 release?

In response to community feedback, we have simplified a few things to make it easier to access and complete your institutional assessment:

  • A simpler Access Request form requires less information to complete. We’re now using the 2018 Carnegie Classification data to get your institution type (R1, R2, etc.), minority serving status, etc. 
  • A number of users felt the Local Weight/Relevance column was confusing, so we removed it, and incorporated a simpler variant on this elsewhere.
  • Many users felt that the Multi-Institutional Collaboration answers weighed too heavily in the computed coverage. We updated the calculations to reduce the impact of this question, and to give a boost (extra credit!!) to institutions that were sustaining or leading collaborations for a given topic.

The bulk of the Model and the questions remain essentially the same as in v1.0, so if you’ve attended one of our workshops or webinars, it will still look very familiar, and you can happily discuss your experience with everyone who completed an assessment in 2020!

Why should we contribute our RCD CM Assessment to the 2021 Community Dataset?

We’re so glad you asked! Contributing institutions help build the Community Dataset, and enrich the resulting picture we have of our community. You will also get an individualized Benchmarking report that compares your assessment to the community, and segments of the community (R1s, R2s, Public or Private institutions, etc.). Finally, you get access to the anonymized, aggregated dataset, if you really like digging into data. 

Sounds good – how do we get our copy and complete the assessment?

First, go to the new Access Request Form and fill it out. That will create an individualized copy of the v1.1 RCD CM Assessment tool for your institution. Then, read the Introduction and Guide to Use, and gather your team to complete the assessment. You can get help from the RCD CM working group by emailing capsmodel-help@carcc.org, or subscribe to the capsmodel-discuss@carcc.org discussion list (click here to join) to hear how other institutions are completing their assessment, and using it as part of strategic planning. You can also get help by joining our RCD CM Office Hours.

Here’s a timeline suggested by folks who worked on assessments last year:

In May: Get your copy and review the Guide. Think about who should be on the team that will complete the assessment (including, e.g., partners in central IT and/or the library). Your team can be just a few people (at smaller and emerging programs) or you could build a team with specialists for each facing.

In June: Your team should meet to discuss the assessment and fill out most of it. Identify specific questions that may need additional input, and who can help with them.

In July: Review your assessment, discuss with your advisory board, and mark priorities for attention. We’re also holding a Workshop at PEARC21 on Strategic Planning using the Model. Watch the website for details!

In August: Submit your assessment to the 2021 Community Dataset (details to follow this summer).

In Fall/Winter 2021: Receive your Individualized Benchmarking report, the Contributors’ detailed Community Dataset report, and the full dataset if you’d like to dig deeper. 

What if we’ve already started with the original version of the Assessment?

Not a problem! It is largely the same, so just get a copy of the new one (as described above), and fill it out. With a little care you can even copy your answers over from the old spreadsheet (we’ll post instructions soon on the website). Feel free to contact us if you want help with this to ensure your data is transferred correctly.

2021 RCD Capabilities Model Office Hours

Have questions about how to get started with the Research Computing and Data Capabilities Model? Or are you already working with it and just want to discuss the process, or a particular aspect of the assessment tool? Join working group members and your colleagues in the community at one of our upcoming Office Hours to get help, ask your questions, and share your experiences!

Office Hours for 2021 are scheduled on the 4th Tuesday of each month for 1 hour, at 11 a.m. PT / 12 noon MT / 1 p.m. CT / 2 p.m. ET. Those dates are:

  • April 27
  • May 25
  • June 22
  • July 27
  • August 24
  • September 28
  • October 26
  • November 23

For details on how to join, email the RCD CM working group at capsmodel-help@carcc.org, or subscribe to the capsmodel-discuss@carcc.org discussion list.
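For anyone who wants to script calendar reminders, the "4th Tuesday of each month" rule above can be checked with a few lines of Python. This is only a minimal sketch using the standard library; the helper name `nth_weekday` is hypothetical and not part of any CaRCC tooling.

```python
import calendar
from datetime import date

def nth_weekday(year, month, weekday, n):
    """Return the date of the n-th given weekday (0 = Monday) in a month."""
    # monthcalendar() returns one list per week; days outside the month are 0.
    days = [week[weekday]
            for week in calendar.monthcalendar(year, month)
            if week[weekday] != 0]
    return date(year, month, days[n - 1])

# 4th Tuesday of each month from April through November 2021.
office_hours = [nth_weekday(2021, month, calendar.TUESDAY, 4)
                for month in range(4, 12)]

for day in office_hours:
    print(day.strftime("%B %d"))
```

The printed dates should match the list above; the call times themselves still come from the announcement.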

May 2021 People Network Calls

Greetings, everyone. We hope your Spring is going well thus far on this last day of April.

Please mark your calendars for our upcoming People Network Track calls for the month of May, with Zoom details below. For handy calendar entries, please try the CaRCC Events Calendar.

Data-Facing Track (first Tuesdays)

Data Feminism: A discussion of Intersectional Data, discussion facilitated by Professor Catherine D’Ignazio

Tuesday, May 4, 2021 at 1p ET/ 12p CT/ 11a MT/ 10a PT

As data are increasingly mobilized in the service of governments and corporations, their unequal conditions of production, their asymmetrical methods of application, and their unequal effects on both individuals and groups have become increasingly difficult for data scientists–and others who rely on data in their work–to ignore. But it is precisely this power that makes it worth asking: “Data science by whom? Data science for whom? Data science with whose interests in mind?” These are some of the questions that emerge from what we call data feminism, a way of thinking about data science and its communication that is informed by the past several decades of intersectional feminist activism and critical thought. Illustrating data feminism in action, this talk will show how challenges to the male/female binary can help to challenge other hierarchical (and empirically wrong) classification systems; it will explain how an understanding of emotion can expand our ideas about effective data visualization; how the concept of invisible labor can expose the significant human efforts required by our automated systems; and why the data never, ever “speak for themselves.” The goal of this talk, as with the project of data feminism, is to model how scholarship can be transformed into action: how feminist thinking can be operationalized in order to imagine more ethical and equitable data practices.

Researcher-Facing Track (second Thursdays)

Workflow Tools for Portable, Reproducible Data Analysis

Thursday, May 13 at 1pm ET/ 12pm CT/ 11am MT/ 10am PT/ 7am HST

Workflow managers are tools that orchestrate the steps in data preparation, analysis, and visualization, and that can be deployed on local computers, HPC/HTC clusters, and cloud environments. These steps can be formalized as pipelines, made reproducible, and used to scale. As the landscape continues to mature, join us for a discussion on the current state of these systems and how they can help your research. After an introduction to workflow standards and tools, including reasons for using them, our speakers will present several research use cases highlighting workflow tools such as Snakemake and Nextflow.
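To make the idea of "steps formalized as a pipeline" concrete, here is a minimal sketch in plain Python. It is not Snakemake or Nextflow syntax, and the step names (`prepare`, `analyze`, `visualize`) and the `run_pipeline` helper are hypothetical. It only illustrates the core idea such tools share: each step declares what it depends on, and the manager derives a valid execution order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline steps; each receives the results of earlier steps.
def prepare(results):
    return [1, 2, 3, 4]

def analyze(results):
    data = results["prepare"]
    return sum(data) / len(data)

def visualize(results):
    return f"mean = {results['analyze']:.1f}"

STEPS = {"prepare": prepare, "analyze": analyze, "visualize": visualize}

# Each step maps to the set of steps that must finish before it runs.
DEPENDS_ON = {
    "prepare": set(),
    "analyze": {"prepare"},
    "visualize": {"analyze"},
}

def run_pipeline(steps, depends_on):
    """Run every step once, in an order that respects the dependencies."""
    results = {}
    for name in TopologicalSorter(depends_on).static_order():
        results[name] = steps[name](results)
    return results

results = run_pipeline(STEPS, DEPENDS_ON)
print(results["visualize"])
```

Real workflow managers add what this sketch leaves out, such as tracking input and output files, skipping steps that are already up to date, and dispatching steps to cluster or cloud resources.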

Emerging Centers Track (third Wednesdays)

Emerging Centers Reflecting Back and Looking Forward, discussion facilitated by Jane Combs and Rich Knepper

Wednesday, May 19 at 12pm ET/ 11am CT/ 10am MT/ 9am PT/ 7am HT

We’ll be discussing the Emerging Centers calls so far (18 of them!) and identifying what members want to discuss in the coming months. We will be sending out some pre-meeting materials to help prompt our memories of prior talks and ideas for future discussion. Track chairs will also discuss the upcoming PEARC workshop activity on “Refining your Research Computing Pitch”.

Systems-Facing Track (third Thursdays)

Clusters in the Sky: How Canada is Building Beyond IaaS Cloud with Magic Castle, discussion facilitated by Jeff Albert, University of Victoria / Compute Canada, and Félix-Antoine Fortin, Université Laval / Compute Canada

Thursday, May 20th at 1p ET/ 12p CT/ 11a MT/ 10a PT

During this session, we will tell the story of the evolution of Canadian IaaS research cloud services. We will cover users’ needs for tools that help them marshal computing resources into functional platforms. As a proposed solution to these needs, we will present the Magic Castle project, which takes general-purpose cloud resources and turns them into useful scientific toolkits. Finally, we will discuss how the infrastructure and tools initiatives are evolving together to meet the increasingly specialized needs of researchers.

General Track Call Information

Interested members of the People Network need not subscribe to a particular track to participate in calls. Additional details for track members, including notes documents and any pre-call activities, will be distributed ahead of the call via the email lists and other communication channels within each track.

The CaRCC (Campus Research Computing Consortium) People Network aims “to foster, build and grow an inclusive community (termed the ‘People Network’) for campus CI, research computing and data professionals.” If you have received this email NOT via CaRCC’s People Network, and you would like to join the People Network, which includes Researcher-facing, Data-facing, Systems-facing, and other tracks, please fill in the form at http://bit.ly/join_carcc_people_network.

People Network Calls, April 2021

Greetings, everyone. Welcome to Spring and April… No Joke!

Please mark your calendars for these upcoming People Network remote Zoom meetings. For handy calendar entries please try the CaRCC Events calendar.

Data-Facing Track (first Tuesdays)

Digital Scholarship Platforms and Workflows – HathiTrust Research Center and Model of Models
Eleanor Koehl (HathiTrust) and Erin McCabe (University of Cincinnati)
Tuesday, April 6, 1p ET/ 12p CT/ 11a MT/ 10a PT/ 8a HT

HathiTrust is the largest non-profit digital library in the world, and roughly 2/3 of the collection is not available for human reading. This presentation will discuss how the HathiTrust Research Center leverages compute resources at Indiana University to make text data from the HathiTrust Digital Library available for text data mining. HTRC services attempt to meet the needs of scholars with a range of skill levels, who use a variety of research methods.

UC’s Digital Scholarship Center (DSC) has developed its own platform for text mining and visualization of large-scale unstructured language datasets. This presentation will provide a demonstration of the platform in addition to reviewing topic modeling concepts that form its primary visualizations. Additionally, we will look at 1-2 analytical approaches to the platform’s output, as well as cover some of the DSC’s work / challenges with data curation, parallel modeling, and working with researchers across disciplines. 

Researcher-Facing Track (second Thursdays)

On Measuring the Impact of Training
Presentations by Kari Jordan (Carpentries), and Julie Wilson Rojewski and Astri Briliyanti, CyberAmbassadors
Thurs, April 8th, 1p ET/ 12p CT/ 11a MT/ 10a PT/ 8a HT

On previous Researcher-Facing calls, we’ve had the opportunity to discuss topics relating to measuring impact and improving training. And as discussed in the Leading Practices of Facilitation, “training & education” is one of the major pillars of our efforts. Many of us provide training opportunities and struggle to define and measure “impact” or “success”: is it short-term gains (quality scores for the class & instructors, reduced support burden, and acclimating users) or long-term considerations (effectiveness of training programs, building relationships, promoting awareness and participation), and does it depend on the kind of training (professional skills, technical topics)? Or are we confounding these, complicating both the objectives and outcomes?

April’s call will showcase two “case studies” of measuring training impact, where each presenter will talk about their programs, define “impact”, and explain their approach to measuring this. Please also join us by contributing to our pre-talk survey: What challenges do you currently face in measuring training impact? And what successful strategies have you tried?

Emerging Centers Track (third Wednesdays)

New Resources Available to the National Research Community: Jetstream 2, Bridges 2, and Anvil
Wednesday, April 21st, 12pm ET/ 11am CT/ 10am MT/ 9am PT/ 7am HT

Representatives from Indiana University, Pittsburgh Supercomputing Center, and Purdue University will discuss the new systems at each of these sites and their capabilities offered to the national research community. All of these resources will be available to researchers nationwide via the XSEDE project allocations system.

Systems-Facing Track (third Thursdays)

Experiences and Advice for Large and Small Data Centers – Cooling
Thursday, April 15th, 1p ET/ 12p CT/ 11a MT/ 10a PT/ 8a HT

Our panel will discuss experiences managing cooling (heat) in data centers – from large scale systems to clusters in closets. The brief presentations will include experiences designing and managing cooling for their infrastructure. Our panelists will take questions from participants and discuss options. Any questions about power/security will be collated for a future session.

General Track Call Information

Interested participants need not subscribe to a particular track to participate in calls. However, additional details for track members, including notes documents and any pre-call activities, will be distributed ahead of the call via the email lists and other communication channels within each track.

All calls will take place within the same Zoom room distributed via email. Please join the People Network (link just below) or contact help@carcc.org for details.

The CaRCC People Network aims “to foster, build and grow an inclusive community (termed the “People Network”) for campus CI, research computing and data professionals.” If you have received this information NOT via CaRCC’s People Network email list and you would like to join the People Network – Researcher-Facing, Data-Facing, Systems-Facing, Emerging-Centers, and other tracks – please fill in our Join the People Network form.

Join us for March People Network Calls

Mark your calendars for these upcoming People Network virtual meetings. (For handy calendar entries, try the CaRCC Events Calendar.)

Plenary session hosted by the Data-Facing Track (first Tuesdays)

Tuesday, March 2, 1p ET/ 12p CT/ 11a MT/ 10a PT

PLENARY: Using Data to Benchmark your Research Computing and Data Program: The RCD Capabilities Model and Community Dataset

Presenters: Claire Mizumoto, UC San Diego & Patrick Schmitz, Semper Cogito Consulting

Join us during the regular data-facing slot for the first (and hopefully not last) People Network plenary session! Claire Mizumoto and Patrick Schmitz from the Capabilities Model working group will present the results from the first community dataset. These assessments were completed using the 1.0 version of the Research Computing and Data Capabilities Model (RCD CM), over a period of several months in the Spring and Summer of 2020. This Community Dataset provides insight into the current state of support for RCD across the community and in a number of key sub-communities.

Across science, engineering, social sciences, and the humanities, every university depends upon research computing and data (RCD) professionals and infrastructure. The rapid evolution and diversification of RCD infrastructure, services, and support poses significant challenges to academic institutions as they try to effectively assess and plan for the growing needs of researchers. Many institutions would also like to assess their capabilities in comparison to peers. The lack of a shared vocabulary to describe the various aspects of RCD support hinders coordinated efforts to advance support of and for researchers. These challenges are especially acute for smaller and emerging RCD support organizations, which often lack experience supporting RCD and have limited resources to develop an analysis framework for strategic planning. To address these gaps, a collaborative team developed a Research Computing and Data Capabilities Model that allows an organization to self-evaluate across a range of RCD services. The Model provides structured input to guide strategic planning, leveraging a defined and shared community vocabulary and enabling benchmarking relative to peer institutions.

Researcher-Facing Track (second Thursdays)

Thursday, March 11, 1pm ET (12pm CT/11am MT/10am PT)

All About Orienting Researchers to Research Computing + Data Resources


2020 RCD CM Community Dataset report available

Scatter Plot showing the capabilities coverage for all 41 institutions

The report describes the first Research Computing and Data Capabilities Model Community Dataset, aggregating the assessments of 41 Higher Education Institutions. These assessments were completed using the 1.0 version of the Research Computing and Data Capabilities Model (RCD CM), over a period of several months in the Spring and Summer of 2020. This Community Dataset provides insight into the current state of support for RCD across the community and in a number of key sub-communities. Download it now! (zenodo.org/record/4344057).

The Research Computing and Data Capabilities Model allows institutions to assess their support for computationally- and data-intensive research, to identify potential areas for improvement, and to understand how the broader community views Research Computing and Data support. The Model was developed by a diverse group of institutions with a range of support models, in a collaboration among Internet2, CaRCC, and EDUCAUSE. This Assessment Tool is designed for use by a range of roles at each institution, from front-line support through campus leadership, and is intended to be inclusive across small and large, and public and private institutions. 

We encourage you to check out the 1.0 release, and begin to use it at your institution. Start with the Capabilities Model Introduction and Guide to Use, which includes background as well as tips for using the model, and a link to the access request form that will create a personalized copy of the Assessment Tool for your institution.  You can also watch the recording of the EDUCAUSE webinar. Keep an eye on the RCD CM working group page for more information and updates.

Welcome to February 2021!

Please see below for People Network calls this month, and make sure to join in! (For handy calendar entries, see the CaRCC Events Calendar.) 

Data-Facing Track (first Tuesdays)

Tuesday, February 2, 1p ET/ 12p CT/ 11a MT/ 10a PT

Casual Tuesday Community Roundtable

We want to take this month to hear a bit about what everyone is up to. This session will be a general sharing and free-form brainstorming session. We’d love to hear 3-5 minutes about projects that people are working on currently or new developments. If you’re stuck on something, feel free to bring that forward and get advice from the brain trust. We can also make breakout rooms for deeper discussions that arise.

Researcher-Facing Track (second Thursdays)

Thursday, February 11, 1pm ET (12pm CT/11am MT/10am PT)

Supporting Researchers with Containers


People Network Calls in the New Year

We hope you have a wonderful holiday season and are able to join People Network calls in January! (For handy calendar entries, see the CaRCC Events Calendar.) 

Data-Facing Track (first Tuesdays)

Tuesday, January 5, 1p ET/ 12p CT/ 11a MT/ 10a PT

Python for Big Data

Presenter: Bala Desinghu, Rutgers University

Python is a popular programming language for developing software and data science applications. Its popularity stems from many factors, such as simplicity, readability, and portability. However, Python is slow compared to C or Fortran, and it does not manage memory well. These limitations in speed and memory management may not be significant when analyzing small data sets, but they become bottlenecks when analyzing big data sets. Techniques based on vectorization, parallelization, just-in-time compilation, and distributed task execution have been widely adopted by the Python community to address these challenges associated with big data. This presentation will address a few techniques suitable for large-scale data analysis and answer the following questions: What to do when the data set size exceeds the available physical memory? How to speed up the data analysis? How to distribute the workloads when doing machine learning for big data sets?
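As one illustration of the first question (data larger than available memory), a common approach is to process the data in fixed-size chunks and keep only running totals. The sketch below is not taken from the presentation; it uses only the standard library, the helper names `read_in_chunks` and `streaming_mean` are hypothetical, and a lazy `range` stands in for a file too large to load at once.

```python
from itertools import islice

def read_in_chunks(values, chunk_size):
    """Yield successive lists of at most chunk_size items from any iterable."""
    iterator = iter(values)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk

def streaming_mean(values, chunk_size=1000):
    """Compute the mean while holding only one chunk in memory at a time."""
    total = 0.0
    count = 0
    for chunk in read_in_chunks(values, chunk_size):
        total += sum(chunk)
        count += len(chunk)
    return total / count if count else float("nan")

# range() is lazy, so the full sequence is never materialized in memory.
mean = streaming_mean(range(1, 1_000_001), chunk_size=10_000)
print(mean)
```

The same chunk-and-accumulate pattern underlies many out-of-core and distributed tools; libraries built for this purpose would typically also vectorize the per-chunk work.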

Researcher-Facing Track (second Thursdays)

Thursday, January 14, 1p ET/ 12p CT/ 11a MT/ 10a PT

All about CaRCC (… beyond the R-F Track)

Presenters:
Tom Cheatham, University of Utah
Lauren Michael, University of Wisconsin
Dana Brunson, Internet2
Patrick Schmitz, Semper Cogito Consulting
