Publications


2024  

  1. Alexandra Zytek, Sara Pido, Sarah Alnegheimish, Laure Berti-Equille, Kalyan Veeramachaneni
    Explingo: Explaining AI Predictions using Large Language Models
    BigData-2024. In Proceedings of IEEE International Conference on Big Data, December 2024.

  2. Sarah Alnegheimish, Laure Berti-Equille, Kalyan Veeramachaneni
    OrionBench: Benchmarking Time Series Generative Models in the Service of the End-User
    BigData-2024. In Proceedings of IEEE International Conference on Big Data, December 2024.

  3. Sarah Alnegheimish, Linh Nguyen, Laure Berti-Equille, Kalyan Veeramachaneni
    Large Language Models Can be Zero-Shot Anomaly Detectors for Time Series?
    DSAA-2024. In Proceedings of IEEE International Conference on Data Science and Advanced Analytics (DSAA), October 2024.

  4. Alexandra Zytek, Sara Pido, Kalyan Veeramachaneni
    LLMs for XAI: Future Directions for Explaining Explanations
    ACM CHI 2024, HCXAI. Workshop on Human-Centered Explainable AI, May 2024.

  5. Lei Xu, Sarah Alnegheimish, Laure Berti-Equille, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
    Single Word Change is All You Need: Designing Attacks and Defenses for Text Classifiers
    Preprint. January 2024.

Theses

  1. Grace Y. Song
    Modeling Control Signals for Reconstruction-based Time Series Anomaly Detection
    M.Eng Thesis. Dept. of EECS, MIT. May 2024.

  2. Wei-En Warren Wang
    Why Did the Prediction Change? Explaining Changes in Predictions as Time Progresses
    M.Eng Thesis. Dept. of EECS, MIT. Feb 2024.

  3. Guanpeng Andy Xu
    SigPro: Enabling Subject Matter Expert Guidance in Feature Engineering
    M.Eng Thesis. Dept. of EECS, MIT. Feb 2024.

2023  

  1. Alexandra Zytek, Wei-En Wang, Dongyu Liu, Laure Berti-Equille, Kalyan Veeramachaneni
    Pyreal: A Framework for Interpretable ML Explanations
    Preprint. December 2023.

  2. Alexandra Zytek, Wei-En Wang, Sofia Koukoura, Kalyan Veeramachaneni
    Lessons from Usable ML Deployments and Application to Wind Turbine Monitoring
    NeurIPS XAI in Action. Workshop on XAI in Action: Past, Present, and Future Applications, December 2023.

Theses

  1. Nassim Oufattole
    Towards Creating Synthetic Data Testbeds for Research
    S.M Thesis. Dept. of EECS, MIT. June 2023.

  2. Frances R. Hartwell
    Zephyr: a Data-Centric Framework for Predictive Maintenance of Wind Turbines
    M.Eng Thesis. Dept. of EECS, MIT. February 2023.

2022  

  1. Lawrence Wong, Dongyu Liu, Laure Berti-Equille, Sarah Alnegheimish, Kalyan Veeramachaneni
    AER: Auto-Encoder with Regression for Time Series Anomaly Detection
    BigData-2022. In Proceedings of IEEE International Conference on Big Data, December 2022.

  2. Lei Xu, Alfredo Cuesta-Infante, Laure Berti-Equille, Kalyan Veeramachaneni
    R&R: Metric-guided Adversarial Sentence Generation
    AACL-2022. In Findings of the Association for Computational Linguistics: AACL-IJCNLP, November 2022.

  3. Shubhra Kanti Karmaker Santu, Md. Mahadi Hassan, Micah J. Smith, Lei Xu, ChengXiang Zhai, Kalyan Veeramachaneni
    AutoML to Date and Beyond: Challenges and Opportunities
    CSUR-2022. In ACM Computing Surveys, November 2022.

  4. Dongyu Liu, Sarah Alnegheimish, Alexandra Zytek, Kalyan Veeramachaneni
    MTV: Visual Analytics for Detecting, Investigating, and Annotating Anomalies in Multivariate Time Series
    CSCW-2022. In Proceedings of the ACM Conference On Computer-Supported Cooperative Work And Social Computing, October 2022.

  5. Lei Xu, Laure Berti-Equille, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
    In Situ Augmentation for Defending Against Adversarial Attacks on Text Classifiers
    ICONIP-2022. In Proceedings of the International Conference on Neural Information Processing, November 2022.
    AdvML-2022. In KDD Workshop on Adversarial Learning Methods for Machine Learning and Data Mining, August 2022.

  6. Alexandra Zytek, Ignacio Arnaldo, Dongyu Liu, Laure Berti-Equille, Kalyan Veeramachaneni
    The Need for Interpretable Features: Motivation and Taxonomy
    KDD Explorations. June 2022.

  7. Sarah Alnegheimish, Dongyu Liu, Carles Sala, Laure Berti-Equille, Kalyan Veeramachaneni
    Sintel: A Machine Learning Framework to Extract Insights from Signals
    SIGMOD-2022. In Proceedings of International Conference on Management of Data, June 2022.

Theses

  1. Alicia (Yi) Sun
    Algorithmic Fairness in Sequential Decision Making
    Ph.D. Thesis. Dept. of IDSS, MIT. October 2022.

  2. Lei Xu
    Towards Deployable Robust Text Classifiers
    Ph.D. Thesis. Dept. of EECS, MIT. September 2022.

  3. Romain Palazzo
    Synthetic Data Assessment based on Model Improvement
    M.S Thesis. Dept. of Math, EPFL. July 2022.

  4. Lawrence C. Wong
    Time Series Anomaly Detection using Prediction-Reconstruction Mixture Errors
    M.Eng Thesis. Dept. of EECS, MIT. May 2022.

  5. Sarah Alnegheimish
    Orion – A Machine Learning Framework for Unsupervised Time Series Anomaly Detection
    S.M Thesis. Dept. of EECS, MIT. May 2022.

2021 

  1. Alexandra Zytek, Dongyu Liu, Rhema Vaithianathan, Kalyan Veeramachaneni
    Sibyl: Understanding and Addressing the Usability Challenges of Machine Learning In High-Stakes Decision Making
    VIS-2021. In IEEE Transactions on Visualization and Computer Graphics (TVCG), January 2022.

  2. Furui Cheng, Dongyu Liu, Fan Du, Yanna Lin, Alexandra Zytek, Haomin Li, Huamin Qu, Kalyan Veeramachaneni
    VBridge: Connecting the Dots Between Features and Data to Explain Healthcare Models **Honorable Mention Award**
    VIS-2021. In IEEE Transactions on Visualization and Computer Graphics (TVCG), January 2022.

  3. Micah J. Smith, Jürgen Cito, Kelvin Lu, Kalyan Veeramachaneni
    Enabling Collaborative Data Science Development with the Ballet Framework
    CSCW-2021. In Proceedings of the ACM Conference on Computer-Supported Cooperative Work and Social Computing, October 2021.

  4. Yi Sun, Ivan Ramirez, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
    Towards Reducing Biases in Combining Multiple Experts Online
    IJCAI-2021. In Proceedings of the International Joint Conference on Artificial Intelligence, August 2021.

  5. Alexandra Zytek, Dongyu Liu, Rhema Vaithianathan, Kalyan Veeramachaneni
    Sibyl: Explaining Machine Learning Models for High-Stakes Decision Making
    CHI-2021. In Extended Abstracts of ACM CHI Conference on Human Factors in Computing Systems, May 2021.

  6. Micah J. Smith, Jürgen Cito, Kalyan Veeramachaneni
    Meeting in the notebook: a notebook-based environment for micro-submissions in data science collaborations
    Preprint. March 2021.

Theses

  1. Zhuofan Xie
    Tracer: A Machine Learning Based Data Lineage Solver with Visualized Metadata Management
    M.Eng Thesis. Dept. of EECS, MIT. December 2021.

  2. Micah J. Smith
    Collaborative, Open, and Automated Data Science
    Ph.D. Thesis. Dept. of EECS, MIT. September 2021.

  3. Alexandra Zytek
    Towards Usable Machine Learning
    S.M. Thesis. Dept. of EECS, MIT. February 2021.

2020 

  1. Alexander Geiger, Dongyu Liu, Sarah Alnegheimish, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
    TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks
    BigData-2020. In Proceedings of IEEE International Conference on Big Data, December 2020.

  2. Sarah Alnegheimish, Najat Alrashed, Faisal Aleissa, Shahad Althobaiti, Dongyu Liu, Mansour Alsaleh, Kalyan Veeramachaneni
    Cardea: An Open Automated Machine Learning Framework for Electronic Health Records
    DSAA-2020. In Proceedings of IEEE 7th International Conference on Data Science and Advanced Analytics, October 2020.

  3. Micah J. Smith, Carles Sala, James Max Kanter, Kalyan Veeramachaneni
    The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development
    SIGMOD-2020. In Proceedings of International Conference on Management of Data, June 2020.

  4. Dongyu Liu, Micah J. Smith, Kalyan Veeramachaneni
    Understanding User-Bot Interactions for Small-Scale Automation in Open-Source Development
    CHI-2020. In Extended Abstracts of ACM CHI Conference on Human Factors in Computing Systems, April 2020.

  5. Micah J. Smith, Kelvin Lu, Kalyan Veeramachaneni
    Demonstration of Ballet: A Framework for Open-Source Collaborative Feature Engineering
    MLSys-2020. Proc. Third Conference on Machine Learning and Systems, March 2020.

Theses

  1. Felipe Alex Hofmann
    Tracer: A Machine Learning Approach to Data Lineage
    M.Eng Thesis. Dept. of EECS, MIT. May 2020.

  2. Ajinkya Kishore Nene
    Deep Learning Approaches to Universal and Practical Steganalysis
    M.Eng Thesis. Dept. of EECS, MIT. May 2020.

  3. Kevin Zhang
    Tiresias: A Peer-to-Peer Platform for Privacy Preserving Machine Learning
    M.Eng Thesis. Dept. of EECS, MIT. February 2020.

  4. Katherine Wang
    A Machine Learning Framework for Predictive Maintenance of Wind Turbines
    M.Eng Thesis. Dept. of EECS, MIT. February 2020.

  5. Lei Xu
    Synthesizing Tabular Data using Conditional GAN
    S.M Thesis. Dept. of EECS, MIT. February 2020.

2019  

  1. Kevin Alex Zhang, Kalyan Veeramachaneni
    Enhancing Image Steganalysis with Adversarially Generated Examples
    CSCML-19. In Proc. 3rd International Symposium, CSCML 2019.

  2. Kevin Alex Zhang, Lei Xu, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
    Robust Invisible Video Watermarking with Attention
    Preprint. 4 Sep 2019.

  3. Yi Sun, Ivan Ramirez, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
    Learning Fair Classifiers in Online Stochastic Settings
    Preprint. 19 Aug 2019.

  4. Lei Xu, Maria Skoularidou, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
    Modeling Tabular data using Conditional GAN
    NeurIPS 2019. Proc. of Advances in Neural Information Processing Systems, 2019.

  5. Lei Xu, Shubhra Kanti Karmaker Santu, Kalyan Veeramachaneni
    MLFriend: Interactive Prediction Task Recommendation for Event-Driven Time-Series Data
    Preprint. 28 Jun 2019.

  6. Kevin Zhang, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
    SteganoGAN: High Capacity Image Steganography with GANs
    Preprint. 12 Jan 2019.

  7. Qianwen Wang, Yao Ming, Zhihua Jin, Qiaomu Shen, Dongyu Liu, Micah J. Smith, Kalyan Veeramachaneni, Huamin Qu
    ATMSeer: Increasing Transparency and Controllability in Automated Machine Learning
    CHI-19. In Proc. ACM Conference on Human Factors in Computing Systems, 2019.

  8. Yi Sun, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
    Learning Vine Copula Models For Synthetic Data Generation
    AAAI-19. In Proc. 33rd AAAI Conference on Artificial Intelligence, 2019.

Theses

  1. Ihssan Tinawi
    Machine Learning for Time Series Anomaly Detection
    M.Eng Thesis. Dept. of EECS, MIT. June 2019.

  2. Kelvin Liu
    Feature Engineering and Evaluation in Lightweight Systems
    M.Eng Thesis. Dept. of EECS, MIT. June 2019.

2018  

  1. Micah Smith, Kelvin Lu, Kalyan Veeramachaneni
    Ballet: A lightweight framework for open-source, collaborative feature engineering
    NeurIPS SysML. Workshop on Systems for ML and Open Source Software, December 2018.

  2. Gaurav Sheni, Benjamin Schreck, Roy Wedge, Max Kanter, Kalyan Veeramachaneni
    Prediction Factory: automated development and collaborative evaluation of predictive models
    Preprint. 29 Nov 2018.

  3. Lei Xu, Kalyan Veeramachaneni
    Synthesizing Tabular Data using Generative Adversarial Networks
    Preprint. 27 Nov 2018.

  4. Dennis Wilson, Silvio Rodrigues, Carlos Segura, Ilya Loshchilov, Frank Hutter, Guillermo López Buenfil, Ahmed Kheiri, Ed Keedwell, Mario Ocampo-Pineda, Ender Özcan, Sergio Ivvan Valdez Peña, Brian Goldman, Salvador Botello Rionda, Arturo Hernández-Aguirre, Kalyan Veeramachaneni, Sylvain Cussat-Blanc
    Evolutionary computation for wind farm layout optimization
    Renewable Energy. Volume 126, October 2018, Pages 681-691.

  5. Max Kanter, Benjamin Schreck, Kalyan Veeramachaneni
    Machine Learning 2.0: Engineering Data Driven AI Products
    Preprint. 1 Jul 2018.

  6. Zara Perumal, Kalyan Veeramachaneni
    Towards building active defense systems for software applications
    CSCML-18. In Proceedings of International Symposium on Cyber Security Cryptography and Machine Learning, Be'er Sheva, Israel, June 2018.

  7. Ignacio Arnaldo, Ankit Arun, Sumeet Kyathanahalli, Kalyan Veeramachaneni
    Acquire, adapt, and anticipate: continuous learning to block malicious domains
    IEEE Big Data 2018. In Proceedings of IEEE International Conference on Big Data, December 2018.

  8. Benjamin Schreck, Nitin John James, Shankar Mallapur, Rajendra Prasad, Sanjeev Vohra, Kalyan Veeramachaneni
    Augmenting Software Project Managers with Predictions from Machine Learning
    IEEE Big Data 2018. In Proceedings of IEEE International Conference on Big Data, December 2018.

Theses

  1. Andrew Montanez
    SDV: An Open Source Library for Synthetic Data Generation
    M.Eng Thesis. Dept. of EECS, MIT. August 2018.

  2. William Xue
    A Flexible Framework for Composing End to End Machine Learning Pipelines
    M.Eng Thesis. Dept. of EECS, MIT. May 2018.

  3. Laura Gustafson
    Bayesian Tuning and Bandits: An Extensible, Open Source Library for AutoML
    M.Eng Thesis. Dept. of EECS, MIT. May 2018.

  4. Akshay Ravikumar
    A Framework to Search for Machine Learning Pipelines
    M.Eng Thesis. Dept. of EECS, MIT. May 2018.

  5. BingFei Cao
    Augmenting the Software Testing Workflow with Machine Learning
    M.Eng Thesis. Dept. of EECS, MIT. May 2018.

  6. Micah J. Smith
    Scaling Collaborative Open Data Science
    S.M Thesis. Dept. of EECS, MIT. May 2018.

  7. Alexander Friedrich Nordin
    End to End Machine Learning Workflow Using Automation Tools
    M.Eng Thesis. Dept. of EECS, MIT. May 2018.

  8. Zara Perumal
    Towards Building Active Defense for Software Applications
    M.Eng Thesis. Dept. of EECS, MIT. February 2018.

  9. Caroline Morganti
    Applying Natural Language Models and Causal Models to Project Management Systems
    M.Eng Thesis. Dept. of EECS, MIT. February 2018.

2017  

  1. Thomas Swearingen, Will Drevo, Bennett Cyphers, Alfredo Cuesta-Infante, Arun Ross and Kalyan Veeramachaneni
    ATM: A Distributed, Collaborative, Scalable System for Automated Machine Learning
    IEEE Big Data - 17. Proc. of 2017 IEEE International Conference on Big Data (Big Data 2017), Boston, MA, USA, December 2017.

  2. Roy Wedge, James Max Kanter, Santiago Moral Rubio, Sergio Iglesias Perez, Kalyan Veeramachaneni
    Solving the "false positives" problem in fraud prediction
    Preprint. 20 Oct 2017.

  3. Alec Anderson, Sebastien Dubois, Alfredo Cuesta-Infante and Kalyan Veeramachaneni
    Sample, Estimate, Tune: Scaling Bayesian Auto-Tuning of Data Science Pipelines
    IEEE DSAA - 17. IEEE International Conference on Data Science and Advanced Analytics, Tokyo, Japan. October 2017.

  4. Bennett Cyphers and Kalyan Veeramachaneni
    AnonML: Locally Private Machine Learning over a Network of Data Holders
    IEEE DSAA - 17. IEEE International Conference on Data Science and Advanced Analytics, Tokyo, Japan. October 2017.

  5. Micah Smith and Kalyan Veeramachaneni
    FeatureHub: Towards Collaborative Data Science
    IEEE DSAA - 17. IEEE International Conference on Data Science and Advanced Analytics, Tokyo, Japan. October 2017.

  6. Ignacio Arnaldo, Alfredo Cuesta-Infante, Ankit Arun, Mei Lam, Costas Bassias, and Kalyan Veeramachaneni
    Learning Representations for Log Data in Cybersecurity
    CSCML-17. International Symposium on Cyber Security Cryptography and Machine Learning, June 2017.

Theses

  1. Alec W. Anderson
    Deep Mining: Scaling Bayesian Auto-tuning of Data Science Pipelines
    M.E. Thesis, MIT Dept of EECS, August 2017.

  2. Bennett James Cyphers
    A System for Privacy-Preserving Machine Learning on Personal Data
    M.Eng. Thesis, MIT Dept of EECS, August 2017.

  3. Donghyun Michael Choi
    SenseML: A Platform for Constructing IOT Data Pipelines
    M.E. Thesis, MIT Dept of EECS, August 2017.

  4. Jonathan Johannemann
    COAL: A Continuous Active Learning System
    M.Fin. Thesis, MIT Sloan School of Management, June 2017.

  5. David Wong
    Build your own deep learner
    M.E. Thesis, MIT Dept of EECS, June 2017.

  6. Katharine Xiao
    Towards Automatically Linking Data Elements
    M.E. Thesis, MIT Dept of EECS, June 2017.

  7. John J.D. O’Sullivan
    Teach2Learn: Gamifying Education to Gather Training Data for Natural Language Processing
    M.E. Thesis, MIT Dept of EECS, February 2017.

2016  

  1. Bennett Cyphers, Kalyan Veeramachaneni
    AnonML: Anonymous machine learning over a network of data holders
    NIPS. NIPS Workshop on Private Multiparty Communication, Barcelona, Spain. December 2016.

  2. Alfredo Cuesta-Infante, Kalyan Veeramachaneni
    Markov Switching Copula Models for Longitudinal Data
    ICDM W-16. 11th International Workshop on Spatial and Spatiotemporal Data Mining, Barcelona, Spain. December 2016.

  3. James Max Kanter, Owen Gillespie, Kalyan Veeramachaneni
    Label, Segment, Featurize: a cross domain framework for prediction engineering
    IEEE DSAA - 16. IEEE International Conference on Data Science and Advanced Analytics, Montreal, Canada. October 2016.

  4. Benjamin Schreck, Kalyan Veeramachaneni
    What would a data scientist ask? Automatically formulating and solving prediction problems
    IEEE DSAA - 16. IEEE International Conference on Data Science and Advanced Analytics, Montreal, Canada. October 2016.

  5. Neha Patki, Roy Wedge, Kalyan Veeramachaneni
    The synthetic data vault
    IEEE DSAA - 16. IEEE International Conference on Data Science and Advanced Analytics, Montreal, Canada. October 2016.

  6. Ben Gelman, Matt Revelle, Carlotta Domeniconi, Aditya Johri, Kalyan Veeramachaneni
    Acting the Same Differently: A Cross-Course Comparison of User Behavior in MOOCs
    EDM-16. International Conference on Educational Data Mining, Raleigh, NC. July 2016.

  7. Kalyan Veeramachaneni, Ignacio Arnaldo, Alfredo Cuesta-Infante, Costas Bassias, Vamsi Korrapati, Ke Li
    AI2: Training a big data machine to defend
    IEEE BDS - 16. IEEE International Conference on Big Data Security on Cloud, New York, NY. April 2016.

Theses

  1. Benjamin J. Schreck
    Towards An Automatic Predictive Question Formulation
    M.Eng. Thesis, MIT Dept of EECS, June 2016.

  2. Yonglin Wu
    Model Factory: A New Way to Look at Data Through Models.
    M.Eng. Thesis, MIT Dept of EECS, June 2016.

  3. Sebastien Boyer
    Transfer Learning for Predictive Models in MOOCs.
    S.M. Thesis, MIT Dept of EECS, IDSS, June 2016.

  4. Neha Patki
    The Synthetic Data Vault: Generative Modeling for Relational Databases.
    M.Eng. Thesis, MIT Dept of EECS, June 2016.

  5. Mario Orozco Gabriel
    Artificial Intelligence Opportunities and an End-To-End Data-Driven Solution for Predicting Hardware Failures
    S.M., MBA thesis, MIT Dept of Mechanical Engineering, Sloan School of Management, June 2016.

2015  

  1. Sebastien Boyer, Ben U. Gelman, Benjamin Schreck, Kalyan Veeramachaneni
    Data Science Foundry for MOOCs
    IEEE DSAA - 15. IEEE/ACM Data Science and Advanced Analytics Conference, October 2015.

  2. James Max Kanter, Kalyan Veeramachaneni
    Deep Feature Synthesis: Towards Automating Data Science Endeavors
    IEEE DSAA - 15. IEEE/ACM Data Science and Advanced Analytics Conference (10% acceptance rate), October 2015.

  3. Ignacio Arnaldo, Una-May O’Reilly, Kalyan Veeramachaneni
    Building Predictive Models via Feature Synthesis
    GECCO-15. ACM conference on Genetic and Evolutionary Computation, July 2015.

  4. Kalyan Veeramachaneni, Alfredo Cuesta-Infante, Una-May O’Reilly
    Copula Graphical Models for Wind Resource Estimation
    IJCAI-15. International Joint Conference on Artificial Intelligence, July 2015.

  5. Franck Dernoncourt*, Kalyan Veeramachaneni, Una-May O’Reilly
    Gaussian Process-based Feature Selection for Wavelet Parameters: Predicting Acute Hypotensive Episodes from Physiological Signals
    CBMS-15. 28th IEEE International Symposium on Computer-Based Medical Systems, June 2015.

  6. Sebastien Boyer, Kalyan Veeramachaneni
    Transfer Learning for Predictive Models in Massive Open Online Courses.
    AIED-15. 17th International Conference on Artificial Intelligence in Education, June 2015.

  7. Yufei Ding, Jason Ansel, Kalyan Veeramachaneni, Xipeng Shen, Una-May O’Reilly, Saman Amarasinghe
    Autotuning Algorithmic Choice for Input Sensitivity
    36th Annual ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2015.

  8. Kalyan Veeramachaneni, Una-May O’Reilly, Kiarash Adl
    Feature Factory: Crowd Sourcing Feature Discovery
    L@S-2015. WIP session at ACM Learning @Scale, March 2015.

  9. Ignacio Arnaldo, Kalyan Veeramachaneni, Andrew Song, Una-May O’Reilly
    Bring Your Own Learner! A cloud-based, data-parallel commons for machine learning
    IEEE Computational Intelligence Magazine. Special Issue on Computational Intelligence for Cloud Computing (February 2015).

Theses

  1. Kevin Wu
    Deep Tuner: A System for Search Technique Recommendation in Program Autotuning. Prof. Saman Amarasinghe
    M.Eng. thesis, MIT Dept of EECS, August 2015.

  2. Max Kanter, 2015
    The Data Science Machine: Emulating Human Intelligence in Data Science Endeavors

  3. Bryan Collazo
    Machine Learning Blocks
    M.Eng. Thesis, MIT Dept of EECS, June 2015.

  4. Michael Wu, 2015.
    The Synthetic Student: A Machine Learning Model to Simulate MOOC Data

  5. Alex Wang, 2015
    Feature Factory: A collaborative, Crowd-Sourced Machine Learning System

  6. Edwin Zhang, 2015
    Image Miner: An Architecture to Support Deep Mining of Images

2014  

  1. Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O’Reilly and Saman Amarasinghe
    OpenTuner: an extensible framework for program autotuning
    ACM 23rd International Conference on Parallel Architectures and Compilation Techniques, August 2014.

  2. Colin Taylor*, Kalyan Veeramachaneni, Una-May O’Reilly, arXiv report
    Likely to stop? Predicting Stopout in Massive Open Online Courses
    Preprint. 14 Aug 2014.

  3. Dennis Wilson, Sylvain Cussat-Blanc, Kalyan Veeramachaneni, Hervé Luga, Una-May O’Reilly
    A continuous developmental model for wind farm layout optimization
    ACM conference on Genetic and Evolutionary Computation, July 2014.

  4. Kalyan Veeramachaneni, Una-May O’Reilly, Colin Taylor, arXiv report
    Towards Feature Engineering at Scale for Data from Massive Open Online Courses
    Preprint. 20 July 2014.

  5. Ignacio Arnaldo, Kalyan Veeramachaneni and Una-May O’Reilly
    Flash: A GP-GPU Ensemble Learning System for handling Large Datasets
    17th European Conference on Genetic Programming, April 2014.

  6. Una-May O’Reilly, Kalyan Veeramachaneni
    Technology for Mining the Big Data of MOOCs
    Research and Practice in Assessment, Winter 2014.

  7. Kalyan Veeramachaneni, Ignacio Arnaldo, Owen Derby, Una-May O’Reilly
    FlexGP: Cloud-Based Ensemble Learning with Genetic Programming for Large Regression Problems
    Journal of Grid Computing.

Theses

  1. Will Drevo 2014
    Delphi: A Distributed Multi-algorithm, Multi-user, Self Optimizing Machine Learning System
    (This thesis was filed as a patent and is pending release)

  2. Franck Dernoncourt (S.M.) 2014
    BeatDB: An end-to-end approach to unveil saliencies from massive signal data sets.

  3. Quentin Agren (Visiting student) 2014
    From Click Stream to Learning Trajectories, Bridging OpenEdx and MOOCdb

  4. Vineet Gopal 2014
    PhysioMiner: A Scalable Cloud Based Framework for Physiological Waveform Mining

  5. Colin Taylor 2014
    Stopout Prediction in Massive Open Online Courses

  6. Elaine Han 2014
    Modeling Problem Solving in Massive Open Online Courses

2013  

  1. Monica Vitali, Una-May O’Reilly, and Kalyan Veeramachaneni
    Modeling Service Execution on Data Centers for Energy Efficiency and Quality of Service Monitoring
    IEEE International Conference on Systems, Man and Cybernetics, October 2013.

  2. Ignacio Arnaldo, Kalyan Veeramachaneni and Una-May O’Reilly
    Analyzing Millions of Submissions to Help MOOC instructors Understand Problem Solving
    NIPS Workshop on Big Learning, August 2013.

  3. Franck Dernoncourt*, Kalyan Veeramachaneni and Una-May O’Reilly
    beatDB: A Large Scale Waveform Feature Repository
    NIPS Workshop on Machine Learning for Clinical Data Analysis and Healthcare, August 2013.

  4. Ignacio Arnaldo, Kalyan Veeramachaneni and Una-May O’Reilly
    Building MultiClass Nonlinear Classifiers with GPUs
    NIPS Workshop on Big Learning, August 2013.

  5. Kalyan Veeramachaneni, Teasha Feldman-Fitzthum, Una-May O’Reilly, Alfredo Cuesta-Infante
    Copula-Based Wind Resource Assessment
    NIPS Workshop on Machine Learning for Sustainability, August 2013.

  6. Franck Dernoncourt*, Choung Do, Sherif Halawa, Una-May O’Reilly, Colin Taylor, Kalyan Veeramachaneni and Sherwin Wu
    MoocViz: A Large Scale, Open Access, Collaborative Data Analytics Framework for MOOCs
    NIPS workshop on Data Directed Education, August 2013.

  7. Kalyan Veeramachaneni, Zachary A. Pardos, Una-May O’Reilly
    MOOCdb: Developing Data Standards for MOOC Datascience
    MOOCShop at Artificial Intelligence in Education, July 2013.

  8. Kalyan Veeramachaneni, Owen Derby, Dylan Sherry, Una-May O’Reilly
    Learning regression ensembles with genetic programming at scale
    Proceedings of the Fifteenth Annual ACM Conference on Genetic and Evolutionary Computation, July 2013.

  9. Dennis Wilson*, Emmanuel Awa, Sylvain Cussat-Blanc, Kalyan Veeramachaneni, Una-May O’Reilly
    On Learning to Generate Wind Farm Layouts
    Proceedings of the Fifteenth Annual ACM Conference on Genetic and Evolutionary Computation, July 2013.

  10. Alexander Waldin*, Kalyan Veeramachaneni, Una-May O’Reilly
    Learning Blood Pressure Behavior from Large Physiological Waveform Repositories
    ICML Workshop on Healthcare, June 2013.

  11. Dennis Wilson*, Kalyan Veeramachaneni, and Una-May O’Reilly
    Cloud Scale Distributed Evolutionary Strategies for High Dimensional Problems
    Applications of Evolutionary Computation, Lecture Notes in Computer Science.

  12. Owen Derby*, Kalyan Veeramachaneni, and Una-May O’Reilly
    Cloud Driven Design of a Distributed Genetic Programming Platform
    Applications of Evolutionary Computation, Lecture Notes in Computer Science.

  13. Erik Hemberg, Constantin Berzan*, Kalyan Veeramachaneni, Una-May O’Reilly
    Introducing Graphical Models to Analyze Genetic Programming Dynamics
    Proceedings of the Twelfth Workshop on Foundations of Genetic Algorithms, January 2013.

Theses

  1. Dylan Sherry
    Exploiting multiple levels of parallelisms in FlexGP for big data sets

  2. Owen Derby, 2013
    FlexGP: a Scalable System for Factored Learning in the Cloud

  3. Alex Waldin, 2013
    Learning Blood Pressure Behavior From Large Blood Pressure Waveform Repositories and Building Predictive Models

  4. Chidube Ezeozue, 2013
    Large-scale Consensus Clustering and Data Ownership Considerations for Medical Applications

  5. Josh Ingram, 2012
    [a]sorted Selection: Improving Building Performance and Diversity Using a New Form of Interactive Evolutionary Algorithm

  6. Danielle Ramazotti
    An Observational Study: The Effect of Diuretics Administration on Outcomes of Mortality and Mean Duration of I.C.U. Stay