Department: AI Agent Research Center

Location: Hong Kong / Shenzhen

Experience: Graduate / Early Career

Openings: 10

About the Role:

You will help build learning-capable AI agents that interact with real-world business environments, learn decision policies for pricing and inventory, and optimize their behavior through feedback. The work combines RL, LLMs, and multi-agent coordination in real industrial systems.

Key Focus:

Design agent-environment interaction systems (observations, actions, rewards).
Apply RL to pricing optimization, inventory allocation, and fulfillment scheduling.
Build long-horizon planning and multi-step reasoning pipelines.
Implement preference learning and feedback optimization (RLHF / RLAIF).
Construct simulation environments and offline evaluation pipelines from real business data.

Ideal Experience:

Background in RL, agents, or decision systems; strong Python and PyTorch skills.
Ability to abstract real-world problems into states, actions, and rewards.
Nice to have: multi-agent experience, game theory, or supply chain optimization.
Tech Stack: Python, PyTorch, Distributed RL, Agent frameworks.

Please send your English CV to: [email protected]

Tags for this job:

Industry: Information Technology

Job Function:

Information Technology > Network & System
Information Technology > Systems / Technical Support
Information Technology > Technical / Functional Consulting
Information Technology > Others
Others > Student / Fresh Graduate / No Experience
Employment Type: Full Time