AI Algorithm Engineer (Zhunan/Taipei)

Job Description

This role is responsible for analyzing the performance bottlenecks of AI algorithms running on real system architectures.

AI applications have become mainstream in recent years. The performance of a software service depends on several key factors: system architecture design, algorithm implementation, and hardware specifications.

In the past, to make collaborative development more efficient, the computing field relied mainly on abstraction to reduce coupling between layers.

That trade-off rested on hardware performance growing rapidly under Moore's Law: in large systems, surplus hardware performance could usually make up for the overhead introduced by abstraction. In the AI era, however, model parameter counts and compute requirements are growing exponentially. AI is one of the few cases in which algorithms that already have commercial demand need more compute than existing hardware can provide. Optimizing the system as a whole has therefore become highly valuable work: cutting vertically through the abstractions between layers to find the combination of implementations that delivers the best performance.

  1. Estimate computational complexity and memory usage from the algorithm (mostly parallel computation)
  2. Estimate an algorithm's theoretical performance on a system from its hardware configuration
  3. Design DOE (design of experiments) studies to measure the gap between measured and theoretical results
  4. Identify bottlenecks in the algorithm, software implementation, system, and hardware from the experimental results
  5. Implement optimized algorithms and demonstrate their effectiveness through experimental results and theoretical analysis
  6. Write technical documentation that application service teams can use as an implementation reference
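
As an illustration of what steps 1 and 2 can look like in practice, the sketch below applies a simple roofline-style estimate to a matrix multiplication. It is not taken from the employer's workflow: the `Hardware` class, the function names, and the hardware numbers (100 TFLOP/s, 1 TB/s) are all hypothetical values chosen for the example, and the memory traffic figure assumes each matrix is read or written exactly once.

```python
from dataclasses import dataclass


@dataclass
class Hardware:
    """Hypothetical accelerator description used only for this example."""
    peak_flops: float     # peak compute throughput in FLOP/s
    mem_bandwidth: float  # peak memory bandwidth in bytes/s


def matmul_cost(m, n, k, bytes_per_elem=2):
    """Return (FLOPs, bytes moved) for C[m, n] = A[m, k] @ B[k, n].

    Assumes A and B are each read once and C is written once,
    which is a lower bound on real memory traffic.
    """
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)
    return flops, bytes_moved


def roofline_time(flops, bytes_moved, hw):
    """Lower-bound runtime in seconds and which resource limits it."""
    compute_time = flops / hw.peak_flops
    memory_time = bytes_moved / hw.mem_bandwidth
    bound = "compute" if compute_time >= memory_time else "memory"
    return max(compute_time, memory_time), bound


# Illustrative numbers only, not a real device specification.
hw = Hardware(peak_flops=100e12, mem_bandwidth=1e12)

# A single-token decode step behaves like a matrix-vector product.
t_decode, bound_decode = roofline_time(*matmul_cost(1, 4096, 4096), hw)

# A 2048-token prefill reuses the same weights across many rows.
t_prefill, bound_prefill = roofline_time(*matmul_cost(2048, 4096, 4096), hw)

print(f"decode:  {t_decode * 1e6:.1f} us, {bound_decode}-bound")
print(f"prefill: {t_prefill * 1e6:.1f} us, {bound_prefill}-bound")
```

Comparing estimates like these against measured runtimes is one way to frame steps 3 and 4: a large gap between the two suggests a bottleneck that the simple model does not capture.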

Requirements

Basic:

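1. At least 3 years of programming and system development experience preferred

2. English proficiency requirements:

- Able to read research papers and technical blogs in the field

- Able to follow recorded technical talks and conference presentations

3. Proficiency in the following programming languages: C/C++/Python

4. Experience with at least one of the following AI frameworks: PyTorch/TensorFlow/ggml/NumPy

5. Familiarity with at least one AI-related algorithm or model: CNN/Transformer-based/others

6. Familiarity with Fermi estimation, dimensional analysis, or similar estimation techniques

7. Understanding of the workflows of at least three AI services:

- Training: data parallelism, ZeRO, LoRA, GRPO,…

- Inference: tensor parallelism, pipeline parallelism, KV cache reuse, speculative decoding…

- Model quantization: GPTQ, AWQ, …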

Nice to have:

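1. Familiarity with GPU kernel languages (e.g. Triton-lang/CUDA/ROCm/SYCL/Vulkan)

2. Familiarity with LLM inference frameworks (e.g. vLLM/SGLang/llama.cpp/ollama/ktransformer/TensorRT…)

3. Familiarity with low-level operating system internals

4. Other related technologies: vector search algorithms, distributed file systems, GPUDirect, RDMA, …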

1
1 year of experience required
50,000 ~ 100,000 TWD / month

About us

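Phison Electronics (群聯電子) is the world's largest independent supplier of flash memory controller ICs. Its current product designs include flash memory controllers for USB flash drives, SD memory cards, eMMC, UFS, PATA, SATA, and PCIe SSDs. The company sells more than 600 million controller ICs worldwide each year, with annual revenue of USD 2.2 billion and a current market capitalization of approximately NT$100 billion, making it the fourth-largest IC design company in Taiwan.

The NAND flash application market that Phison focuses on is one of the fastest-growing segments of the semiconductor industry. Consumer products, industrial applications, embedded systems, mobile devices, gaming systems, automotive electronics, and servers all require large quantities of flash memory controllers and storage products, giving both the market and Phison enormous potential for future growth.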