In September 2014, I came on board as the most junior member of the Computation Structures Group at MIT.
synchronization on tardis
Tardis is a highly scalable cache coherence protocol that uses timestamps to avoid the linear growth in storage that traditional directory-based protocols incur as the core count increases. In 2016, I implemented Tardis on a RISC-V system to examine how synchronization might work on such a system. This became my Master's thesis.
We plan to demonstrate that Tardis is practical at scale by refining its implementation on a RISC-V system and building a multi-FPGA system with hundreds, if not thousands, of cores, all fully coherent through the Tardis protocol.
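The storage argument can be made concrete with a little arithmetic. The sketch below compares per-cache-line coherence metadata for a full-map directory, which keeps one sharer bit per core, against a timestamp scheme that keeps an owner ID plus a write and a read timestamp. The function names and the 20-bit timestamp width are illustrative assumptions, not figures taken from the thesis.

```python
def full_map_directory_bits(num_cores: int) -> int:
    """Full-map directory: one presence bit per core for each tracked line."""
    return num_cores


def timestamp_metadata_bits(num_cores: int, timestamp_bits: int = 20) -> int:
    """Owner ID plus a write timestamp and a read timestamp.

    timestamp_bits is an assumed width chosen purely for illustration.
    """
    # Bits needed to name any one of num_cores cores (at least one bit).
    owner_id_bits = max(1, (num_cores - 1).bit_length())
    return owner_id_bits + 2 * timestamp_bits


if __name__ == "__main__":
    for cores in (16, 256, 1024):
        print(cores, full_map_directory_bits(cores), timestamp_metadata_bits(cores))
```

Under these assumptions the directory is smaller at 16 cores (16 bits versus 44), but at 1,024 cores it needs 1,024 bits per line while the timestamp metadata needs 50, which is why the approach becomes attractive at high core counts.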
In June 2012, I formally joined the Parallel Computing Laboratory at UC Berkeley. It has since morphed into the ASPIRE project, and I continued my work with this group until I graduated in May 2014.
- I worked with a partner to complete a port of the Linux kernel to the RISC-V architecture, a new ISA designed at Berkeley.
- I described simple processor pipelines with Chisel, Berkeley's new hardware description language, embedded in Scala.
- I practiced parallelizing programs with the Maven Vector-Thread architecture.
In the Fall 2013 semester, ~aou and I partnered for CS 250, a graduate VLSI design course. Our term project extended Hwacha, a decoupled vector-fetch data-parallel accelerator, to execute packed floating-point numbers more efficiently. When registers are wide enough for double-precision values but the computation uses single or half precision, packing several narrower values into each register can improve efficiency. To accomplish this, we added fused multiply-add (FMA) units and augmented Hwacha's instruction sequencing.
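As a software analogy for packing, the sketch below stores two single-precision values in one 64-bit word and recovers them. It only illustrates the bit layout idea; the function names and the choice to put the first value in the low 32 bits are assumptions, and none of this is Hwacha's actual hardware design.

```python
import struct
from typing import Tuple


def pack_singles(lo: float, hi: float) -> int:
    """Pack two single-precision floats into one 64-bit word (lo in bits 0-31)."""
    (lo_bits,) = struct.unpack("<I", struct.pack("<f", lo))
    (hi_bits,) = struct.unpack("<I", struct.pack("<f", hi))
    return (hi_bits << 32) | lo_bits


def unpack_singles(word: int) -> Tuple[float, float]:
    """Recover the two single-precision floats from a packed 64-bit word."""
    lo_bits = word & 0xFFFFFFFF
    hi_bits = (word >> 32) & 0xFFFFFFFF
    (lo,) = struct.unpack("<f", struct.pack("<I", lo_bits))
    (hi,) = struct.unpack("<f", struct.pack("<I", hi_bits))
    return lo, hi


def elements_per_register(register_bits: int, element_bits: int) -> int:
    """How many elements of a given width fit in one register."""
    return register_bits // element_bits
```

With 64-bit registers, `elements_per_register(64, 32)` gives 2 and `elements_per_register(64, 16)` gives 4, which is the source of the potential efficiency gain.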
In the Spring 2014 semester, we picked up where we left off in CS 250 and added mixed-precision computation to Hwacha. Instead of specifying a global precision, we now specify the number of registers of each type (double, single, half, and integer), and Hwacha automatically allocates as many as are needed. In the process, we completely rewrote the Vector Memory Unit (VMU) and extended the logic needed for chaining to operate correctly.
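One way to see why per-type register counts help is to estimate how many vector elements fit in a fixed-size register file. The sketch below is a simplified model under stated assumptions: the byte widths (including 8 bytes for integer registers), the register file size, and the function names are all hypothetical and are not taken from the Hwacha implementation.

```python
from typing import Dict

# Assumed storage per element for each register type, in bytes.
BYTES_PER_REGISTER: Dict[str, int] = {
    "double": 8,
    "single": 4,
    "half": 2,
    "integer": 8,  # assumption: integer registers are 64 bits wide
}


def bytes_per_element(counts: Dict[str, int]) -> int:
    """Storage one vector element needs across all requested registers."""
    return sum(BYTES_PER_REGISTER[kind] * n for kind, n in counts.items())


def max_vector_length(regfile_bytes: int, counts: Dict[str, int]) -> int:
    """Largest number of elements whose registers fit in the register file."""
    footprint = bytes_per_element(counts)
    if footprint == 0:
        raise ValueError("at least one register must be requested")
    return regfile_bytes // footprint
```

In this model, a hypothetical 4,096-byte register file holds 128 elements when a kernel requests four double registers, and 256 elements when it requests four single registers instead.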
Our reports will be coming soon. Other online resources on Hwacha are below:
In mid-2012, I collaborated with ~aou to spearhead the effort to bring Linux to RISC-V. As part of this work, I:
- wrote (and subsequently fixed) the context switch assembly code,
- created a block device driver,
- worked on riscv-linux-gcc (the RISC-V cross-compiler),
- dabbled with the Linux virtual memory subsystem, and
- twiddled bits in many other places.
Related resources:
- Linux/RISC-V installation guide
- RISC-V/Newlib toolchain installation guide
- Unfinished guide to the Linux/RISC-V architectural port
- Outdated RISC-V ABI notes
- [PDF] Kernel Struct Diagram
In November 2017, five years and about six thousand lines of C and RISC-V assembly later, RISC-V became the newest architecture supported by Linux. You can read more about RISC-V at http://www.riscv.org.
berkeley solar drone
Between October 2011 and May 2012, I worked with three other people to compete in the Intel-Cornell Cup USA. As a part of the facetiously named Light Basin Laboratory, we worked towards our goal of building a solar-powered unmanned aerial vehicle (UAV), the Berkeley Solar Drone. I enthusiastically embraced the many facets of this project, including:
- authoring the proposal which earned us a spot in the final competition and $2,500 in funding,
- designing and building the avionics subsystem, featuring the Atmel ATmega328P as the central on-board microcontroller (with the Intel Atom intended as a base station on the ground),
- interfacing to GPS and IMU (accelerometer, gyroscope, magnetometer) via UART and I2C,
- visualizing flight data within a PyGTK application, and
- writing a final technical report and presenting a demonstration to judges at the Intel-Cornell Cup USA competition in Orlando, Florida.
Our code for the avionics systems is available at the lbl-bsd GitHub repository.
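For a flavor of what reading GPS data over UART involves, the sketch below validates the checksum on a text sentence. It assumes the GPS module emits standard NMEA 0183 sentences, which is common for serial GPS modules but is an assumption here; the function names are hypothetical and this is not code from the repository.

```python
from functools import reduce


def nmea_checksum(payload: str) -> int:
    """XOR of every character between the leading '$' and the '*'."""
    return reduce(lambda acc, ch: acc ^ ord(ch), payload, 0)


def is_valid_nmea(sentence: str) -> bool:
    """Check that a sentence of the form $<payload>*<hex checksum> is intact."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    payload, _, checksum_text = sentence[1:].rpartition("*")
    try:
        expected = int(checksum_text, 16)
    except ValueError:
        return False
    return nmea_checksum(payload) == expected
```

Rejecting sentences that fail this check is a cheap way to discard bytes corrupted on the serial link before they reach any downstream visualization.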