The company is a global technology company that provides shipping and mailing solutions powering billions of physical and digital transactions in the connected and borderless world of commerce. Helping clients achieve their greatest commerce potential are 12,000+ passionate employees around the world, our relentless pursuit of innovation with over 2,300 active patents, and our focus on clients, who are at the center of all that we do - from small businesses to 90% of the Fortune 500.
maintains mission-critical enterprise applications. You will be involved in ensuring the availability of the products and the infrastructure on which they are hosted. You will have access to various monitoring and troubleshooting tools to investigate and resolve issues. You will also be responsible for performing root cause analysis and taking the necessary corrective actions to prevent recurrence of issues.
• This role is very similar to a DevOps role (a lean version of DevOps)
• Communicate effectively with the manager and team about risks, issues, and changes associated with product monitoring.
• Work with the team to identify which monitors can be created and how. Ensure that application monitoring is done in the best possible way and that alerts and monitors are configured correctly.
• Create and update monitors for products.
• Take part in development scrums to identify which features are being developed, and have both infrastructure and application monitors ready well before launch.
• Maintain documentation of the systems and provide reporting that ensures proper tracking and visibility of issues and projects.
• Find ways to optimize monitoring and use technology to automate it.
• Identify training needs and upskill to support the portfolio.
• Strong hands-on experience with various AWS services
• Design and implement monitoring solutions using advanced tools such as Sumo Logic, Dynatrace, Splunk, and CloudWatch
• Execute technical operations around monitoring for automation, network, cloud, and reliability engineering
• Support the operational environment that provides the company's client-facing applications and meets the requirements for service availability and recoverability
• Organize and facilitate outage event management activities, including outage status communication, problem determination, and root cause analysis
• Demonstrate ownership and accountability
• Graduate or postgraduate degree (preferably in computer science or a related field)
• 4-5 years of relevant experience in product/application support at the L2 level.
• Excellent written and verbal communication, time management, and presentation skills
• Experience with SaaS-based product support.
• High-level understanding of APIs, databases, and systems architecture and design
• Experience in infrastructure, security and compliance
• Strong experience with AWS, and experience in cloud and network administration
• Good to have: infrastructure-as-code experience (Ansible, Terraform, CloudFormation)
• Experience with scripting languages (shell scripts, Perl, Python)
• Experience with Dynatrace, Splunk, Sumo Logic, AppDynamics, Prometheus, Grafana, and/or other monitoring/logging solutions
• Platforms: Windows, Linux
• Strong cross-functional collaboration and relationship-building skills, and the ability to achieve results without direct reporting relationships
• Strong sense of personal responsibility and accountability for delivering high quality work, both personally and at a team level