Do Machine Learning Models Produce TypeScript Types that Type Check?
Type migration is the process of adding types to untyped code to gain assurance at compile time. TypeScript and other gradual type systems facilitate type migration by allowing programmers to start with imprecise types and gradually strengthen them. However, adding types is a manual effort, and migrations of large industry codebases have reportedly taken several years. In the research community, there has been significant interest in using machine learning to automate TypeScript type migration. Existing machine learning models report a high degree of accuracy in predicting individual TypeScript type annotations. However, in this paper we argue that accuracy can be misleading, and we should address a different question: can an automatic type migration tool produce code that passes the TypeScript type checker?
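To make the idea of gradual strengthening concrete, here is a minimal sketch. The function names are hypothetical and do not come from the paper. The first version uses the imprecise type `any`, the second strengthens it, and the commented-out third version shows how an annotation that looks plausible on its own can still make the program fail to type check.

```typescript
// Stage 1: a freshly migrated function. `any` disables checking,
// so this compiles no matter how `items` is used.
function totalLengthLoose(items: any): any {
  let total = 0;
  for (const item of items) {
    total += item.length;
  }
  return total;
}

// Stage 2: the same function with strengthened types.
// The checker can now verify that `.length` exists on every element.
function totalLength(items: string[]): number {
  let total = 0;
  for (const item of items) {
    total += item.length;
  }
  return total;
}

// A hypothetical mispredicted annotation. Left commented out so the block compiles:
// with `items: number[]`, the compiler reports that property 'length'
// does not exist on type 'number'.
//
// function totalLengthBad(items: number[]): number {
//   let total = 0;
//   for (const item of items) {
//     total += item.length;
//   }
//   return total;
// }

console.log(totalLengthLoose(["ab", "c"]), totalLength(["ab", "c"]));
```

A single wrong annotation like the last one barely affects per-annotation accuracy, yet it is enough to stop the file from passing the type checker.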
We present TypeWeaver, a TypeScript type migration tool that can be used with an arbitrary type prediction model. We evaluate TypeWeaver with three models from the literature: DeepTyper, a recurrent neural network; LambdaNet, a graph neural network; and InCoder, a general-purpose, multi-language transformer that supports fill-in-the-middle tasks. Our tool automates several steps that are necessary for using a type prediction model, including (1) importing types for a project’s dependencies; (2) migrating JavaScript modules to TypeScript notation; (3) inserting predicted type annotations into the program to produce TypeScript when needed; and (4) rejecting non-type predictions when needed.
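Steps (3) and (4) can be illustrated with a rough sketch. This is not TypeWeaver's implementation: the `Prediction` shape, the `looksLikeType` heuristic, and the fallback to `any` are all assumptions made for illustration. A real tool would more likely parse the prediction with a TypeScript parser than use a regular expression.

```typescript
// Hypothetical shape of one model prediction (an assumption of this sketch).
interface Prediction {
  name: string;      // identifier to annotate
  predicted: string; // raw text emitted by the model
}

// Crude pattern for one union member: a (possibly dotted) name, an optional
// simple generic argument list, and any number of `[]` suffixes.
const SIMPLE_TYPE = /^[A-Za-z_$][\w$.]*(<[\w$., \[\]]+>)?(\[\])*$/;

// Accept a prediction only if every `|`-separated member matches the pattern.
// This is deliberately conservative: unions nested inside generics are rejected,
// and a bare keyword such as `return` would slip through as a type name.
function looksLikeType(text: string): boolean {
  return text
    .split("|")
    .map((part) => part.trim())
    .every((part) => SIMPLE_TYPE.test(part));
}

// Build an annotated declaration fragment. Falling back to `any` when the
// prediction is rejected is an assumption of this sketch.
function annotate(p: Prediction): string {
  const type = looksLikeType(p.predicted) ? p.predicted.trim() : "any";
  return `${p.name}: ${type}`;
}

console.log(annotate({ name: "count", predicted: "number" }));
console.log(annotate({ name: "items", predicted: "string[] | null" }));
console.log(annotate({ name: "x", predicted: "return x + 1;" }));
```

The last call shows the case that step (4) guards against: a general-purpose model can emit text that is not a type at all, and inserting it verbatim would produce a file that does not even parse.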
We evaluate TypeWeaver on a dataset of 513 JavaScript packages, including packages that have never been typed before. With the best type prediction model, we find that only 21% of packages type check, but more encouragingly, 69% of files type check successfully.
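The gap between the package-level and file-level rates is easier to see with a small worked example. The sketch below assumes that a package counts as type checking only when every one of its files does. The data is made up for illustration and is not taken from the paper's evaluation.

```typescript
// Hypothetical results: for each package, whether each file type checks.
const results: Record<string, boolean[]> = {
  "pkg-a": [true, true, true],
  "pkg-b": [true, false, true, true],
  "pkg-c": [true, true, false],
};

// Fraction of individual files that type check, across all packages.
function fileRate(data: Record<string, boolean[]>): number {
  const files = Object.values(data).flat();
  return files.filter((ok) => ok).length / files.length;
}

// Fraction of packages in which every file type checks
// (the all-files-must-pass rule is an assumption of this sketch).
function packageRate(data: Record<string, boolean[]>): number {
  const packages = Object.values(data);
  return packages.filter((files) => files.every((ok) => ok)).length / packages.length;
}

console.log(fileRate(results), packageRate(results));
```

Here 8 of 10 files pass but only 1 of 3 packages does, because a single failing file is enough to fail its whole package.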
Slides (speaker notes): ecoop2023notes.pdf, 314 KiB
Slides: ecoop2023.pdf, 629 KiB
Thu 20 Jul (times shown in Pacific Time, US & Canada)
13:30 - 15:00 | ECOOP 5: Synthesis | Research Papers at Habib Classroom (Gates G01) | Chair(s): Karine Even-Mendoza (King’s College London)
13:30 | 15m Talk | Synthesis-Aided Crash Consistency for Storage Systems | Research Papers | Jacob Van Geffen (Veridise Inc.), James Bornholt (University of Texas at Austin), Emina Torlak (Amazon Web Services and University of Washington), Xi Wang (University of Washington)
13:45 | 15m Talk | Synthesizing Conjunctive Queries for Code Search | Research Papers | Chengpeng Wang (Hong Kong University of Science and Technology), Peisen Yao (Zhejiang University), Wensheng Tang (Hong Kong University of Science and Technology), Gang Fan (Ant Group), Charles Zhang (Hong Kong University of Science and Technology)
14:00 | 15m Talk | Hoogle⋆: Constants and λ-abstractions in Petri-net-based Synthesis using Symbolic Execution | Research Papers | Henrique Botelho Guerra (INESC-ID and IST, University of Lisbon), João F. Ferreira (INESC-ID and IST, University of Lisbon), João Costa Seco (NOVA-LINCS; Nova University of Lisbon)
14:15 | 15m Talk | Building Code Transpilers for Domain-Specific Languages Using Program Synthesis | Research Papers | Sahil Bhatia (University of California, Berkeley), Sumer Kohli (UC Berkeley), Sanjit Seshia (UC Berkeley), Alvin Cheung (University of California at Berkeley)
14:30 | 15m Talk | Do Machine Learning Models Produce TypeScript Types that Type Check? | Research Papers
14:45 | 15m Talk | Toward Tool-Independent Summaries for Symbolic Execution | Research Papers | Frederico Ramos (Instituto Superior Técnico), Nuno Sabino (Instituto Superior Técnico, Carnegie Mellon University), Pedro Adão (IST-ULisboa and Instituto de Telecomunicações), David Naumann (Stevens Institute of Technology), José Fragoso Santos (INESC-ID/Instituto Superior Técnico, Portugal)