Location: Canada
Date published: 20-May-2025
Job ID: 144577
Work Location: Hybrid
Position Type: Contract, 6 Months (potential for extension)
Location: Toronto, ON
Compensation Range: $40/hr (T4) to $60/hr (INC)
Our client, a multinational technology company specializing in information technology services and consulting, is looking to hire a Site Reliability Engineer (SRE).
Requirements:
• Monitor and maintain system reliability using tools like DataDog, VictorOps, ELK, Grafana, and Prometheus.
• Ensure uptime and performance by proactively identifying issues and responding to alerts.
• Troubleshoot, investigate, and resolve complex technical issues, collaborating with the engineering team when required for timely resolution.
• Handle production incidents by analyzing root causes, prioritizing resolution, escalating as needed, and adhering to defined SLAs, SLIs, and SLOs.
• Develop and implement automation scripts (Python or other scripting languages) to streamline operational tasks, improve system efficiencies, and reduce manual workload.
• Manage and maintain infrastructure across AWS environments.
• Implement best practices to ensure optimal performance, reliability, and security of cloud-based applications.
• Work closely with development, QA, and operations teams to drive continuous improvement and foster a culture of reliability.
• Manage requests and incidents through JIRA and ServiceNow, documenting troubleshooting procedures, solutions, and lessons learned for continuous improvement.
• Use customer-provided hardware when carrying out SRE activities.
All interested applicants who meet the qualifications listed above are invited to submit a resume by clicking "Apply Now".