Kisaco Research

Senior Director of R&D, Siemens EDA

This is a single event that attracts all the top thought leaders and practitioners who contribute to the advancement of AI-related technologies. From an EDA perspective, we get to hear the challenges of building software to optimize the HW architectures, and from the HW side, the challenges of building designs that are fast, compact, and thrifty with power. There are many compelling AI innovations being discussed, and they are all connected.

Section Manager, Tokyo Electron

I had the privilege of attending the event, and it was truly an unforgettable experience. From the moment I walked in, I was greeted with warmth and enthusiasm. The organization and execution were impeccable, making it clear that a tremendous amount of effort and dedication had gone into planning every aspect. The speakers were not only knowledgeable but also engaging, leaving the audience inspired and motivated. The topics covered were not only relevant but thought-provoking, sparking meaningful conversations among attendees.

CEO, Efabless Corporation

It's a great conference, covering the entire innovation chain from chips and software to systems and end-users. The Gen AI sessions were particularly interesting as we track the emergence of this important trend from ground zero.

CEO and Co-Founder, NeuEdge Ltd

The summit has been an unparalleled event for meeting brilliant people working at every stage of the AI hardware stack. As a founder of an AI hardware startup, I was grateful for the willingness of all attendees to share their insights, and I left with many new friends and connections.

Senior Full Stack Data Scientist, Capital One

I personally thought this was a great mix of hardware companies and software companies coming together to solve cutting-edge problems. As an attendee, there was a lot of useful learning.

 

Dylan Patel

Chief Analyst
Semi Analysis


Author:

Xavier Soosai

Chief Information Officer
Center for Information Technology/National Institutes of Health

As the Director of the Office of Information Technology Services of the Center for Information Technology (CIT), Soosai oversees ten service areas and the delivery of scientific research and business operations across the institutes and centers (ICs) at NIH. This includes maintaining the high-performance computing environment used by NIH intramural scientists; maintaining NIH’s secure, high-speed network; ensuring the viability and availability of collaboration services, compute hosting and storage services, identity and access management services, service desk support, and more for the NIH community. 

Soosai works with CIT leadership and internal service area managers and collaborates with NIH ICs to define scope and provide technical expertise, strategic planning, and leadership for local and enterprise IT projects that drive efficiency and innovation across NIH. Additionally, Soosai is responsible for directing the evaluation and adoption of rapidly evolving technology and forecasting future technology needs.

 


 

Emerging Memory Innovations
Systems Infrastructure/Architecture
HBM/CXL

Author:

SangJoon Hwang

Corporate EVP, Head of DRAM Product & Technology
Samsung Electronics

SangJoon Hwang received B.S., M.S., and Ph.D. degrees in electrical engineering from Korea University in 1994, 1996, and 2008, respectively.

He joined Samsung Electronics in Hwaseong, South Korea, in 1996. He went on to lead a DRAM design group from 2014 and the Flash design team from 2017 as a Vice President, and the Memory Product Planning team from 2019 as a Senior Vice President. His experience across diverse areas, from product planning to design, enhances the overall quality of Samsung DRAM products.

 

Since 2023, he has led the DRAM Product & Technology organization of Samsung's memory division. His current research interests include architectures for next-generation DRAM and product development utilizing new process technologies for new product line-ups.


Author:

Yang Seok Ki

CXL Board of Directors Member, VP and CTO of Memory Solutions Lab
Samsung Electronics

Dr. Yang Seok Ki is a Vice President and CTO of the Memory Solutions Lab (MSL) at Samsung Semiconductor Inc. in San Jose, California. Since joining Samsung in 2011, he has led various advanced development projects including SmartSSD, Key-Value SSD, the CXL Memory Expander, and Memory Semantic SSD. In addition, he led the NVMe Key Value standard, the SNIA Key Value API, and the SNIA Computational Storage Architecture and API. He is a member of the CXL Board of Directors and a technical chair of the Data Centric Computing workstream of the Open Compute Project (OCP) Future Technology Initiative (FTI). Prior to joining Samsung, he worked for Oracle's Server Technology Group. Before his industrial career, he was involved in High Performance Computing (HPC), Grid Computing, and Cloud research at the Information Sciences Institute at the University of Southern California and the Center for Networked Systems at the University of California, San Diego. He received his Ph.D. in Electrical Engineering and Computer Engineering, and his Master's and Bachelor's degrees in Computer Engineering, from Seoul National University. He also completed the Engineering Leadership Professional Program (ELPP) at the University of California, Berkeley.


Large Language Models (LLMs) have revolutionized natural language processing but have posed significant challenges in training and inference due to their enormous memory requirements. In this talk, we delve into techniques and optimizations to mitigate memory constraints across the entire lifecycle of LLMs.

The first segment explores memory-optimized LLM training. We discuss training challenges and cover different techniques under Parameter-Efficient Fine-Tuning (PEFT), such as prompt tuning, LoRA, and adapters.
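As a rough illustration of the LoRA idea named above (a sketch only, not the speakers' implementation; the dimensions and scaling factor are illustrative assumptions): rather than updating a full weight matrix, LoRA trains two small low-rank factors whose product is added to the frozen pretrained weight, cutting trainable parameters dramatically.

```python
import numpy as np

d, r = 1024, 8  # model dimension and LoRA rank (illustrative values)

rng = np.random.default_rng(0)
W = rng.standard_normal((d, d))         # frozen pretrained weight (not trained)
A = rng.standard_normal((r, d)) * 0.01  # trainable low-rank factor
B = np.zeros((d, r))                    # B starts at zero, so fine-tuning starts from W

alpha = 16.0                            # common LoRA scaling hyperparameter
W_eff = W + (alpha / r) * (B @ A)       # effective weight used in the forward pass

full_params = d * d                     # 1,048,576 parameters in the full matrix
lora_params = A.size + B.size           # 16,384 trainable parameters (64x fewer)
print(full_params, lora_params)
```

Because `B` is initialized to zero, `W_eff` equals `W` before any training step; only `A` and `B` receive gradients, which is what keeps the fine-tuning memory footprint small.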

LLM inference is more memory-bound than compute-bound. In this section we explore inference optimizations, mostly for transformer architectures, such as the Paged Key-Value (KV) Cache, speculative decoding, quantization, in-flight batching strategies, and Flash Attention, each contributing to enhanced inference speed and efficiency.
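The KV cache mentioned above can be sketched in a few lines (a toy single-head example under assumed shapes, not any vendor's implementation): during autoregressive decoding, the key and value projections of past tokens are appended to a cache instead of being recomputed at every step, yielding identical attention outputs.

```python
import numpy as np

def attend(q, K, V):
    # Scaled dot-product attention for a single query vector.
    scores = K @ q / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

d, steps = 4, 6
rng = np.random.default_rng(1)
tokens_k = rng.standard_normal((steps, d))  # per-token key projections
tokens_v = rng.standard_normal((steps, d))  # per-token value projections
queries = rng.standard_normal((steps, d))   # one query per decode step

# Incremental decoding with a KV cache: append one row per generated token.
K_cache = np.empty((0, d))
V_cache = np.empty((0, d))
cached_out = []
for t in range(steps):
    K_cache = np.vstack([K_cache, tokens_k[t:t + 1]])
    V_cache = np.vstack([V_cache, tokens_v[t:t + 1]])
    cached_out.append(attend(queries[t], K_cache, V_cache))

# Recomputing K and V from scratch each step gives the same outputs,
# at the cost of redundant work that grows with sequence length.
recomputed = [attend(queries[t], tokens_k[:t + 1], tokens_v[:t + 1])
              for t in range(steps)]
assert all(np.allclose(a, b) for a, b in zip(cached_out, recomputed))
```

The memory cost of this cache grows with sequence length and batch size, which is exactly what techniques like paged KV caches and KV cache offloading are designed to manage.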

Finally, we explore the concept of Coherent Memory and how it helps with inference optimizations such as KV cache offloading and LoRA weight recomputation.

By illuminating these advancements, this talk aims to provide a comprehensive understanding of state-of-the-art memory optimization techniques for LLMs, empowering practitioners to push the boundaries of natural language processing further.

Systems Infrastructure/Architecture
AI/ML Compute

Author:

Arun Raman

Deep Learning Solutions Architect
NVIDIA

Arun Raman is an AI solution architect at NVIDIA, adept at navigating the intricate challenges of deploying AI applications across edge, cloud, and on-premises environments within the consumer Internet industry. In his current role, he works on the design of end-to-end accelerated AI pipelines for consumer internet customers, meticulously addressing preprocessing, training, and inference optimizations. His experience extends beyond AI, having worked with distributed systems and multi-cloud infrastructure. He shares practical strategies and real-world experiences, empowering organizations to leverage AI effectively.
