Department: AI Agent Research Center
Location: Hong Kong / Shenzhen
Experience: Graduate / Early Career
Openings: 10
About the Role:
You will help build AI agents that interact with real-world business environments, learn decision policies for pricing and inventory, and improve their behavior through feedback. The work combines reinforcement learning (RL), large language models (LLMs), and multi-agent coordination in real industrial systems.
Key Focus:
Design agent-environment interaction systems (observations, actions, rewards).
Apply RL to pricing optimization, inventory allocation, and fulfillment scheduling.
Build long-horizon planning and multi-step reasoning pipelines.
Implement preference learning and feedback optimization (RLHF / RLAIF).
Construct simulation environments and offline evaluation pipelines from real business data.
Ideal Experience:
Background in RL, agents, or decision systems; strong Python and PyTorch skills.
Ability to abstract real-world problems into states, actions, and rewards.
Nice to have: multi-agent experience, game theory, or supply chain optimization.
Tech Stack: Python, PyTorch, Distributed RL, Agent frameworks.
Please send your CV in English to: [email protected]
Employment Type: Full Time