VP Cloud & Infrastructure – AWS
Infrastructure
Remote job
Job description
At Toku, we create bespoke cloud communications and customer engagement solutions to reimagine customer experiences for enterprises. We provide an end-to-end approach to help businesses overcome the complexity of digital transformation and deliver mission-critical CX through cloud communication solutions. Toku combines local strategic consulting expertise, bespoke technology, regional in-country infrastructure, connectivity, and global reach to serve the diverse needs of enterprises operating at scale. Headquartered in Singapore, Toku supports customers across APAC and beyond, with a growing footprint across global markets.
This is a senior leadership role responsible for owning and scaling Toku’s cloud and infrastructure platform, with AWS at its core. You will drive reliability, security, scalability, and operational excellence across a globally distributed, mission-critical environment, while building strong processes and leading a growing infrastructure team. This role is both strategic and operational, requiring strong leadership combined with the ability to guide hands-on technical direction. You will be a great fit for this role if you can build structure, lead teams, and elevate infrastructure maturity in a fast-growing environment.
What you will be doing
- Cloud architecture & platform strategy: Define and own the long-term cloud infrastructure strategy with AWS as the primary platform, ensuring scalability, resilience, and alignment with business growth.
- Highly available systems design: Design and oversee fault-tolerant, multi-region, and multi-environment (production, staging, disaster recovery) architectures supporting mission-critical systems.
- AWS platform ownership: Own and standardise AWS architecture, account structure, networking, IAM, and core infrastructure patterns across environments.
- Infrastructure reliability & performance: Drive capacity planning, performance optimisation, and reliability engineering to support low-latency, high-throughput workloads.
- Operational excellence & incident management: Own uptime, SLAs and SLOs, and lead incident response, root cause analysis, and continuous improvement of system resilience.
- Security & compliance leadership: Ensure strong security posture across infrastructure, implementing security-by-design principles and supporting compliance initiatives such as ISO certifications.
- Cloud cost optimisation (FinOps): Drive cost governance, budgeting, and forecasting, ensuring efficient resource utilisation without compromising reliability.
- Engineering & services collaboration: Partner with Engineering and Services teams to enable reliable deployments, improve CI/CD pipelines, and support customer onboarding and production readiness.
- Infrastructure modernisation: Lead adoption of containerisation, Infrastructure as Code, and automation-first practices across the platform.
- Process & governance: Establish and improve infrastructure processes, standards, change management practices, and operational playbooks.
- Documentation & operational readiness: Ensure runbooks, documentation, and operational procedures are maintained and consistently followed.
- Leadership & team development: Lead, mentor, and scale infrastructure, cloud, and platform engineering teams, driving accountability and high performance.
- Cross-functional influence: Act as a key stakeholder in architectural reviews and collaborate with senior leadership to shape infrastructure direction.
We’d love to hear from you if you have
- Experience level: 10+ years of experience in infrastructure, cloud, or platform engineering roles, with demonstrated progression into leadership positions.
- Team leadership: Proven experience managing and scaling infrastructure or cloud engineering teams, including mentoring and performance management.
- AWS expertise (core requirement): Extensive hands-on experience with AWS and cloud-native architectures in production environments.
- AWS services ownership: Deep ownership of AWS services including (but not limited to) EC2, ECS/EKS, ALB/NLB, VPC, IAM, S3, RDS, DynamoDB, CloudFront, Route53, KMS, CloudWatch, and security services.
- Scalable systems: Strong background in building and operating highly scalable, reliable, and secure production systems.
- Infrastructure as Code & automation: Hands-on experience with Terraform or similar tools, and driving automation-first infrastructure practices.
- Containers & deployment models: Strong understanding of containerisation, orchestration, and modern deployment patterns.
- Observability: Experience designing and operating monitoring, logging, alerting, and observability frameworks.
- Security & compliance: Solid understanding of cloud security, IAM, encryption, and experience supporting compliance or audit processes (e.g., ISO 27001).
- Engineering collaboration: Experience working closely with engineering and services teams to improve deployment quality and operational reliability.
- Process-driven leadership: Proven ability to introduce and enforce engineering processes, standards, and operational discipline.
- Cost optimisation: Experience managing cloud costs, budgeting, and FinOps practices.
- Domain experience (preferred): Experience supporting large-scale, customer-facing SaaS or enterprise platforms.
- Location: This role is based in Malaysia, with Kuala Lumpur (KL) preferred. It is mostly work-from-home (WFH) for the time being, but may in future move to a hybrid WFH / work-from-office (WFO) model at our KL Sentral office.
What would you get?
Training and Development
Discretionary Yearly Bonus & Salary Review
Healthcare Coverage based on location
15 days Paid Annual Leave, plus other leave allowances
Toku has been recognised as a LinkedIn Top Startup and by the Financial Times as one of APAC’s Top 500 High Growth Companies. If you’re looking to be part of a company on a strong growth trajectory while working on meaningful, real-world challenges, we’d love to hear from you.