I am a Research Scientist affiliated with the Data Intensive Systems Lab directed by Chris Jermaine. As a data scientist, I am interested in “sensemaking” by leveraging data curation, the power of computing, and human curiosity for knowledge. This passion for data analytics stems from my experience working and collaborating with experts, researchers, and practitioners in a wide range of areas and projects. Presently, as part of the Pliny Project, we are working on a high-performance distributed storage and compute platform for big data. My data science research is anchored in a socio-technical perspective; that is, one that considers the interconnections between people, computing technology, and data.

Data-rich environments have been part of my journey since my undergraduate years. As Data Scientist for STEMscopes™, I focused on learning analytics and the big data generated by nearly half a million students and over 50,000 teachers using STEMscopes™ as their online science curriculum. In collaboration with educators and curriculum developers, we incorporated a learning analytics framework to characterize pedagogical processes. Previously, we embedded a science game based on the TV show CSI in a virtual world (Wyville.net) with nearly six million users, and studied gameplay and problem-solving patterns. This work belongs to a field I call Learninformatics (a term coined during an NSF Big Data Ideas Lab), which can be defined as an interdisciplinary approach to developing and improving methods for storing, curating, organizing, and analyzing learning data. Here, learning is understood as a lifelong process that takes place in both formal and informal settings.

Prior to joining Rice, as a member of Texas A&M University’s Center for the Study of Digital Libraries, I worked on numerous interdisciplinary research projects. In collaboration with nautical archaeologists, we developed nadlShip, an ontology for characterizing ship properties and components. I created an algorithm that improves information contextualization and enhances document retrieval by combining domain knowledge with the ontology, helping archaeologists reconstruct ancient sunken ships. As of 2017, Drs. Filipe Castro (Texas A&M Ship Reconstruction Lab) and Pierre Drap (French National Centre for Scientific Research, LSIS) use this ontology for 3D visualization of ship structures.

In other textual and data-rich analytics projects, I worked on a collection of Picasso’s paintings and an extensive historical narrative, with the goal of linking textual and graphical information to explore Picasso’s life and artistic creation. Similarly, I created an interactive timeline visualization tool for analyzing the evolution of Cervantes’ Don Quixote, using digital copies of the original printed books.

I received a B.S. from Universidad Rafael Landívar (Guatemala), followed by master’s and Ph.D. degrees from Texas A&M University, all in Computer Science. In my undergraduate thesis, I implemented a genetic algorithm-based scheduler, which received an Outstanding Computer Science Thesis National Award. In 2014, I was selected to participate in the Heidelberg Laureate Forum, a gathering of 200 young mathematicians and computer scientists with Turing, Abel, and Fields laureates. That same year, our early work on learning analytics received an Outstanding Publication Award from the American Educational Research Association.

I am a member of the International Network of Guatemalan Scientists, the Association for Computing Machinery, the Association for Linguistic and Literary Computing, and the IEEE Computer Society. I also serve as a reviewer for conferences and journals related to data mining, learning analytics, linguistic computing, and digital humanities. In addition, I have published in various peer-reviewed journals, conference proceedings, and book chapters, and have presented in the United States, Europe, and Latin America.