Tools

Table of Contents

  1. Tools
    1. Background
    2. Engineering Tools
      1. Terminal
      2. Editors, IDEs etc.
      3. Testing
      4. Linters and Formatters
      5. Package Managers
      6. Version Control
      7. Containerization
      8. Prototyping
        1. Interactive Prototyping
      9. Python Tools
      10. Audio Tools
      11. File Processing
    3. AI Tools
    4. Writing + Publishing
    5. Signal Processing
      1. Audio
      2. Image
    6. Datasets
    7. Scientific Data Computing
    8. Dimensionality Reduction
    9. Systems
      1. Large Scale Data Processing
      2. Databases
      3. Feature Stores
      4. Compute + Infrastructure
      5. Deployment
      6. AIML Deployment, Interfaces and Infrastructure
    10. Information
      1. Information Retrieval
      2. Language
    11. Machine Learning
      1. Classical Machine Learning
      2. Deep Learning
      3. Distributed Machine Learning
      4. Machine Learning Tools
    12. Language Models
      1. Open Source AI Models
      2. Finetuning Tools
      3. Diffusion Tools
      4. HuggingFace Tools
    13. Visualization
      1. Interactive Visualization
        1. D3 Extensions
    14. Programming Languages
      1. Web - JavaScript and Frameworks
  2. Things I wish I had time to explore and learn more about
    1. Creative Coding
    2. C++

Background

This page is where I keep track of the libraries and tools I have used or am currently using. It is inspired by Uses This, The Missing Semester of Your CS Education, The Pragmatic Programmer, Strange Loop, and the EYEO Festival.

I owe most of my engineering knowledge to the following people: Tommy Ngyuen, Sundar Rajan, Ron Minnich, James Wexler, Ryan Stutsman, Jakob Johnson, Peter Jensen, Jeff Phillips and Varun Shankar. Their teaching, office hours, and conversations gave me a foundation not only for debugging and navigating all levels of the technical stack but also for core engineering practices: environment/tool setup, experimentation and monitoring setup, command line hacks, numerical programming, and the mental fortitude you need for finding and fixing bugs.

In the age of ChatGPT and LLMs their advice still holds: the only way to become a better engineer is to spend countless hours debugging, experimenting and writing code. When it comes to UIs, web development and even visualization engineering, vibecoding gets 90% of the job done (at least from what I’ve seen and tried). The last 10% is the human-creative part that AI can’t replace.

Software engineering is not dead. The mission-critical software we use every day in health, driving and more is implemented in C++ and other battle-tested technologies.

Note: This page is constantly updated. Last update: February 13, 2026

Engineering Tools

Terminal

Editors, IDEs etc.

  • VS Code (open-source version: VSCodium). GitHub Copilot’s autocomplete is such a nuisance that I have turned it off. If I need AI help, Gemini or Claude is more useful.
  • vim
  • intellij

Testing

Linters and Formatters

Package Managers

Version Control

Containerization

Prototyping

Interactive Prototyping

Python Tools

Audio Tools

File Processing

AI Tools

Writing + Publishing

Signal Processing

Audio

Image

Datasets

Scientific Data Computing

Dimensionality Reduction

Systems

Large Scale Data Processing

  • pyspark
  • polars -> substitute for pandas when working with Parquet and very large (non-CSV) data

Databases

Feature Stores

Compute + Infrastructure

Deployment

AIML Deployment, Interfaces and Infrastructure

Information

Information Retrieval

Language

Machine Learning

Classical Machine Learning

Deep Learning

Distributed Machine Learning

Machine Learning Tools

Language Models

Open Source AI Models

Finetuning Tools

Diffusion Tools

HuggingFace Tools

Visualization

The Python visualization ecosystem is fragmented, and choosing the right visualization library depends on the project and audience. Many of the Python visualization libraries build on matplotlib. Matplotlib, while tedious, is powerful and versatile, with the ability to render interactive 3D plots and conic sections and to do image processing work. Seaborn is great for statistical charts with an aesthetic similar to ggplot, but the syntax can get gnarly. Altair is an easy-to-learn member of the Vega-Lite ecosystem, but it requires deep knowledge of the grammar of graphics and gets gnarly with interaction and customization.
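
To give a feel for the matplotlib 3D plotting mentioned above, here is a minimal sketch, assuming matplotlib and numpy are installed. The saddle surface and the grid size are arbitrary choices for illustration. It uses the headless Agg backend so it runs anywhere; with an interactive backend the same figure can be rotated with the mouse.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np

# Build a regular grid and evaluate a saddle surface z = x^2 - y^2 on it.
x = np.linspace(-2, 2, 50)
y = np.linspace(-2, 2, 50)
X, Y = np.meshgrid(x, y)
Z = X**2 - Y**2

fig = plt.figure()
ax = fig.add_subplot(projection="3d")  # request a 3D axes
surface = ax.plot_surface(X, Y, Z, cmap="viridis")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")
```

Even this small plot shows the tedium: the grid, the axes and every label are set up by hand, but each piece is also fully under your control.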

Interactive Visualization

D3 - a library that has caused many ups and downs for me, but I am still excited about its capabilities. This is highly opinionated, but D3 is done. Svelte, JavaScript, charting libraries (Chart.js) and frontend engineering have slowly absorbed and improved on D3’s magic, to the point that you can implement many features with the Canvas API, CSS, and frameworks. D3 was built for the 2009-2021 era, and its API design now seems stuck in time as it clashes with the frameworks that need to use its capabilities. With some engineering tricks and staring at the source code it is possible to integrate D3 into a framework project, but it is now treated more as a module library.

At the time of this writing, D3’s capabilities are unmatched for interactive visualization, so it is still needed for that work and for building interfaces between users and ML models.

D3 Extensions

Programming Languages

  • Python - machine learning, scripting, prototyping and almost all of my development these days
  • Bash - compiling and running programs, experiments etc.
  • Java - information retrieval, data engineering

Web - JavaScript and Frameworks

Not learning a JS framework was the biggest regret I had from my Utah days. Choosing a framework really depends on how well you know JavaScript and on your project goals (see the framework documentaries in the talks page). Since I’m interested in interactive visualization for machine learning, Svelte seems to be the most widely adopted, based on the tools and research published by Anthropic, Apple AIML, Google DeepMind, and PAIR, and on OpenAI hiring Jay Wang. I wouldn’t count out Vue and React, because Catherine Yeh built AttentionViz with Vue and the Polo Club of Data Science publishes a wide range of ML-vis tools using Svelte, Vue and React.

Things I wish I had time to explore and learn more about

These are things that I wish I had time to explore.

Creative Coding

I first heard about this at the EYEO Festival. I started with Dan Shiffman’s Coding Train but shifted to Kevin Workman’s Happy Coding and then stopped. It was a lot of fun to use p5, and it explained some of the JavaScript quirks I struggled with. After seeing all the cool stuff put out by Martin Wattenberg, Fernanda Viegas, Golan Levin, Ravi Chugh, and Andrew McNutt, I want to pick it up again and try to implement some of the projects from Code as a Creative Medium.

C++

I wish I had time to explore C++ in more depth, because it is the language for turning research projects and code into production software and tools that people actually use. That is especially true for my interests: most of the tooling in audio, vision, and graphics is written in C++.