Experienced data scientist and software engineer with 12 years spent designing, conducting, and
sharing results of complex research. Worked with faculty, staff, and students from dozens of top
universities to achieve excellence in the preparation and performance of research-related tasks,
including data management, hypothesis formulation, and both mathematical and statistical modeling.
Possesses a diverse set of software skills, including DevOps, full-stack app development,
performance benchmarking, ETL, and cloud deployments. Enjoys hosting workshops and hackathons to
train users in state-of-the-art technologies and best practices. Familiar with a wide range of
data formats, with a particular focus on migrating local file standards (e.g., binary, CSV, JPG)
to cloud-performant ones (HDF5, Zarr; sketched below). Knowledgeable about various mathematical and
statistical applications, including high-dimensional time series analysis, nonlinear dynamics, and
machine learning.
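A minimal sketch of this kind of format migration, assuming a hypothetical 2-D CSV recording; the file names, shape assumptions, and chunk sizes are illustrative, not from any specific project:

```python
# Minimal sketch: rewrite a local CSV recording as a chunked Zarr array.
# File names, shape assumptions, and chunk sizes are hypothetical.
import numpy as np
import zarr

data = np.loadtxt("recording.csv", delimiter=",")  # assumes a 2-D (samples x channels) table

z = zarr.open(
    "recording.zarr",
    mode="w",
    shape=data.shape,
    chunks=(10_000, data.shape[1]),  # chunked layout enables efficient cloud reads
    dtype=data.dtype,
)
z[:] = data  # Zarr compresses each chunk on write (Blosc by default)
```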
Relevant coursework: machine learning, Bayesian statistics, network science, stochastic analysis, time series analysis, partial differential equations, nonlinear dynamics
Relevant coursework: calculus, linear algebra, probability, statistics, discrete mathematics & combinatorics, numerical analysis, real analysis, mathematical physics, algorithms & data structures, cryptography, machine learning, symbolic logic, cognitive psychology, neurophysiology, advanced neuroscience, philosophy of mind, philosophy of science
Developed and maintained 5 open-source software repositories, ensuring their automated testing
suites, documentation, tutorials, and demos remained functional and up to date.
Created and maintained 12+ data processing pipelines for neuroscience labs, enabling their data
to flow seamlessly from acquisition to sharing.
Personally curated and submitted a total of 256 TB of high-value datasets to NIH data archives on
behalf of various research groups.
Managed the company's cloud resources on Amazon Web Services (AWS), including storage, compute,
and Identity and Access Management (IAM).
Supported the research community by providing technical assistance and resolving issues and
feature requests in a timely manner.
Facilitated user education across various platforms by running sessions at multiple conferences
and workshops, increasing user adoption and effective system utilization.
All software supported terabyte-scale data management, analysis, and visualization for the field
of neurophysiology.
Reconciled the computational properties of biologically realistic neural networks with artificial
machine learning models (such as those used in computer vision) through mathematical theory and
stochastic biophysical simulations; results were communicated through 3 journal publications and
a presentation at the high-impact COSYNE conference (top 4% of abstracts accepted).
Collaborated with several top experimentalists in neuroscience as a trainee in the NeuroNex
program, which focused on understanding how neural function emerges from underlying structure.
Ran 4 tutorial sections for 112 students, reinforcing course concepts and achieving a 96% satisfaction rate. Graded homework assignments and exams. Delivered main lectures in the professor's absence as needed.
Analyzed geological data from water samples in conjunction with the Indiana Department of Environmental Management to issue compliance permits under Indianapolis regulatory standards.
Reviewed and corrected over 300 pages of "An Idiot's Guide to Algebra II".
Explored the statistical effects of algorithmic curation (the use of automated filtering mechanisms in the delivery and display of information) by measuring properties of simulated social network models; presented results at the MIDSURE conference.
Examined information diffusion through large-scale simulations of G-protein signaling mechanisms on a high-performance computing (HPC) cluster, then presented results at an undergraduate symposium.
Developed an intuitive user interface for file management using interactive validation and
real-time suggestions, streamlining the process for data submission to NIH archives.
Built a robust testing suite involving multiple levels of integration, with user interactions
emulated using Puppeteer (sketched below), to enhance reliability and long-term maintainability.
Tracked and documented dozens of hands-on user tests to refine the user experience.
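A minimal sketch of one emulated user-interaction test. The original suite used Puppeteer (Node.js); to keep all examples in one language, this sketch uses pyppeteer, a Python port, and the dev-server URL and element IDs are hypothetical:

```python
# Minimal sketch of an emulated user-interaction test via pyppeteer.
# The URL and element selectors are hypothetical illustrations.
import asyncio
from pyppeteer import launch

async def test_file_submission_form():
    browser = await launch(headless=True)
    try:
        page = await browser.newPage()
        await page.goto("http://localhost:3000/submit")       # hypothetical dev server
        await page.type("#dataset-name", "example-session")   # hypothetical field ID
        await page.click("#validate-button")                  # hypothetical button ID
        await page.waitForSelector("#validation-success")     # assumed success marker
    finally:
        await browser.close()

asyncio.run(test_file_submission_form())
```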
Led the development of an automated data conversion tool capable of reading more than 40 distinct
data formats produced by neurophysiology experiment devices and automatically writing them to the
Neurodata Without Borders (NWB) standard.
Designed universal APIs that transparently handled each layer of complexity, simplifying the tasks
of tagging, grouping, metadata transcription, temporal alignment, asset linking, buffering,
chunking, and compression (see the sketch after this entry).
Implemented a distributed cloud deployment system to run large-scale, off-site, batched conversions
through Amazon Web Services (AWS).
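A minimal sketch of writing one converted recording to NWB with chunked, compressed storage; the data, metadata, and file names are illustrative stand-ins, and the real tool wrapped these steps behind the universal APIs described above:

```python
# Minimal sketch: write a converted recording to the NWB standard with
# chunking and compression. All data and metadata here are hypothetical.
from datetime import datetime, timezone
from uuid import uuid4

import numpy as np
from hdmf.backends.hdf5 import H5DataIO
from pynwb import NWBFile, NWBHDF5IO, TimeSeries

nwbfile = NWBFile(
    session_description="example converted session",  # hypothetical metadata
    identifier=str(uuid4()),
    session_start_time=datetime.now(timezone.utc),
)

raw = np.random.randn(30_000, 8)  # stand-in for data read from a source format
nwbfile.add_acquisition(
    TimeSeries(
        name="raw_voltage",
        data=H5DataIO(raw, chunks=(10_000, 8), compression="gzip"),  # chunked + compressed
        unit="volts",
        rate=30_000.0,  # sampling rate in Hz
    )
)

with NWBHDF5IO("session.nwb", mode="w") as io:
    io.write(nwbfile)
```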
Created a command-line tool, used by the NIH data archive to validate all data uploads, which
provides automated suggestions for metadata improvements that enhance data findability and
reuse.
Mirrored the design, style, and functionality of linting tools such as flake8, pydocstyle, and ruff.
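A minimal sketch of that linter-style interface; the rule code, message text, and JSON metadata layout are hypothetical illustrations rather than the actual tool's rules:

```python
# Minimal sketch of a flake8-style metadata validator CLI.
# Rule codes, messages, and the JSON layout are hypothetical.
import argparse
import json
import sys

def check_metadata(path):
    """Yield (line, code, message) findings in linting-tool fashion."""
    with open(path) as f:
        metadata = json.load(f)
    if not metadata.get("description"):
        yield 1, "META001", "missing 'description'; add one to improve findability"

def main():
    parser = argparse.ArgumentParser(description="Validate upload metadata.")
    parser.add_argument("paths", nargs="+", help="metadata files to validate")
    args = parser.parse_args()

    exit_code = 0
    for path in args.paths:
        for line, code, message in check_metadata(path):
            print(f"{path}:{line}: {code} {message}")  # flake8-style report line
            exit_code = 1
    sys.exit(exit_code)

if __name__ == "__main__":
    main()
```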