Senior Research Scientist, AWS Incident Tooling & Response

AWS Resilience is responsible for preventing and responding to availability and security issues for all AWS services. In other words, we're the people who keep the cloud running. We work on the most challenging problems, with new services and possible failure modes constantly emerging, and we're looking for talented people who want to help.
You'll join a diverse team of software and security experts, operations managers, and people in other vital roles. You'll collaborate with people across AWS to help us deliver the highest standards for safety, security, and availability. You'll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.
AWS Incident Response is at the heart of the high availability of Amazon Web Services. We make customer-impacting events shorter and less frequent by driving large-scale event and incident response. Our automated tooling quickly identifies the cause of an issue and helps mitigate its impact, and much of our engineering time is spent on projects to improve the tooling and automation. We also provide manual incident management for AWS and other Amazon groups, directing the resolution of an issue with service teams and diving deep into those events to drive improvements to the tooling. It's an exciting time to join our team as we are growing and expanding our offerings.
Key job responsibilities:
You will own the organization's strategy for the use of ML and GenAI, and propose the best technology to improve our ability to detect incidents, find root causes faster, and correlate with prior incidents, in order to shorten customer-facing AWS incidents.
Your work will enable us to identify gaps in our current strategy and draw learnings from past incidents.
You will contribute to shortening incident response through deep analysis and the introduction of new technology.

Minimum Qualifications:
Master's degree (or European advanced degree equivalent) or PhD in Computer Science or a related technical, math, economics, or scientific field.
Several years of relevant experience developing large-scale machine learning or deep learning models and/or systems in a production environment.
Experience using Python, R, Matlab, or another statistical/machine learning software language.
Several years of experience specifically with deep learning (e.g., CNN, RNN, LSTM).
Experience hiring or mentoring more junior colleagues.

Preferred Qualifications:
PhD in computer science, engineering, mathematics, economics, or a related technical/scientific field.
Hands-on experience building models with deep learning frameworks such as PyTorch or similar.
Experience with machine learning, time series, NLP, and CV solutions.
Proven communication skills, presentation skills, and attention to detail.
Comfortable working in a fast-paced, highly collaborative, dynamic work environment.
Scientific thinking and the ability to invent, with a track record of thought leadership and contributions that have advanced the field.

Amazon is an equal opportunities employer. We believe passionately that employing a diverse workforce is central to our success. We make recruiting decisions based on your experience and skills. We value your passion to discover, invent, simplify and build.
Protecting your privacy and the security of your data is a longstanding top priority for Amazon. Please consult our Privacy Notice (https://www.amazon.jobs/en/privacy_page) to know more about how we collect, use and transfer the personal data of our candidates.
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/content/en/how-we-hire/accommodations.