- System Engineering / Admin
- Frankfurt am Main
- Amazon Web Services (AWS)
- Google Cloud Platform
- Microsoft Azure
- Quality Assurance
- RESTful Web Services
At IBM, work is more than a job – it’s a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you’ve never thought possible. Are you ready to lead in this new era of technology and solve some of the world’s most challenging problems? If so, let’s talk.
Your Role and Responsibilities
We are seeking a highly motivated Site Reliability Engineering professional with know-how in transforming IT environments towards a NextGen operating model in the context of hybrid cloud platforms.
Site Reliability Engineering (SRE) professionals are engineers who specialize in reliability and resiliency, with the right mix of knowledge and skills in software and systems. They are responsible for analyzing business needs, determining problems, and advising on, designing, building, testing, deploying, changing, and maintaining well-engineered information systems and ecosystems.
Who are site reliability engineers and what do they do?
A site reliability engineer is a software developer with IT operations experience – someone who knows how to code, and who also understands how to ‘keep the lights on’ in a large-scale IT environment.
Site reliability engineers spend no more than half their time performing manual IT operations and system administration tasks – analyzing logs, performance tuning, applying patches, testing production environments, responding to incidents, conducting postmortems – and spend the rest of their time developing code that automates those tasks. Their goal is to spend much less time on the former and much more time on the latter over time.
At a higher level, the SRE team serves as a bridge between development teams and operations teams, enabling the development team to bring new software or new features to production as quickly as possible, while also ensuring an agreed-upon acceptable level of IT operations performance and error risk in line with the service level agreements (SLAs) the organization has in place with its customers. Based on their experience and a wealth of operations data, the SRE team helps the development and operations teams establish:
- Service level indicators (SLIs): Measurements of the service level provided by systems – metrics such as availability (uptime) or latency
- Service level objectives (SLOs): Agreed-upon target values or ranges for service level indicators, for example 99.9% availability over a 30-day window
- Error budgets: The maximum amount of time a system can fail or underperform without violating the contractual terms of the SLA. More than a metric, the error budget is the tool a site reliability engineering team uses to automatically reconcile a company’s pace of innovation with its service reliability.
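To illustrate how these three concepts relate, here is a minimal Python sketch. It is not IBM tooling; the function names (`availability`, `error_budget`, `budget_remaining`), the 99.9% target, and the 30-day window are hypothetical values chosen purely for illustration. It assumes an availability SLI measured as the share of successful requests and a time-based error budget derived from the target.

```python
from datetime import timedelta


def availability(good_requests: int, total_requests: int) -> float:
    """SLI: fraction of requests served successfully (hypothetical definition)."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return good_requests / total_requests


def error_budget(slo_target: float, window: timedelta) -> timedelta:
    """Allowed downtime within `window` for an availability target such as 0.999."""
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in (0, 1]")
    return window * (1.0 - slo_target)


def budget_remaining(slo_target: float, window: timedelta,
                     downtime: timedelta) -> timedelta:
    """Budget left after observed downtime; a negative value means the target was missed."""
    return error_budget(slo_target, window) - downtime


if __name__ == "__main__":
    window = timedelta(days=30)          # assumed measurement window
    budget = error_budget(0.999, window)  # assumed 99.9% availability target
    left = budget_remaining(0.999, window, timedelta(minutes=10))
    print(f"SLI: {availability(999_500, 1_000_000):.4%}")
    print(f"Error budget: {budget.total_seconds() / 60:.1f} minutes")
    print(f"Remaining: {left.total_seconds() / 60:.1f} minutes")
```

With these assumed numbers, a 99.9% target over 30 days leaves roughly 43 minutes of tolerated downtime. A team might slow releases when the remaining budget approaches zero and ship faster when most of it is unspent.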
Required Technical and Professional Expertise
- Years of Experience in Cloud Transformations: > 3 Years
- Ability to design, migrate and modernize cloud-based applications, platforms and workloads
- Extensive experience with modern tools and concepts for cloud service, management and operation, e.g. GitHub, GitLab, Bamboo, Jenkins, Travis CI, JSON, YAML, Python, Java, C#
- Background in IT engineering in Networking, Security and Data storage
- Excellent leadership & consulting skills
- Certification as Cloud Solution Architect by AWS, Azure or GCP
- Excellent written and verbal communication skills (fluent German and English)
Professional knowledge of function, business unit or country operations. Understand organizational resources, priorities, needs and policies.
Guide other professionals. Adapt communications and approaches to conclude negotiations with various partners, resulting in common agreements.
Analyze complex/new situations, anticipate potential problems and future trends, assess opportunities, impacts, and risks. Develop and implement solutions.
Leads multi-functional teams, or conducts special projects, or manages department(s) (national or international). Has vision of functional or unit mission. Influences people and organizations, including executive management, when issues are complex/difficult and require considerable diplomacy. Considerable latitude in responsibilities to define and decide on tools, processes, priorities and resources following general business unit directives. Recognized as an expert in their field. Often no precedent exists.
Impact on Business/Scope:
Accountable for projects or programs involving multi-functional, country-wide or regional teams. Responsible for overall functional program success. Activities are subject to business measurements, impact customer satisfaction, and impact functional, business unit, or country costs or expenses.
About Business Unit
This position currently sits within Global Technology Services (GTS) Infrastructure Services (IS) or a shared services function supporting GTS.
As announced in October 2020, IBM intends that its managed infrastructure services business of the GTS organization will become an independent company named Kyndryl by the end of 2021, creating two distinct and powerful market-leading companies.
Together, we will advance the vital systems that power the digital economy. Serving over 4,600 technology-intensive, highly regulated customers, including over 75% of the Fortune 100, our people will design, run, and manage the most modern and reliable technology infrastructure that the world depends on today.
We will work flexibly and in partnership with our customers to amplify business outcomes while always pushing ourselves to improve and meet all opportunities. Come join our team of diverse, devoted, and empathetic technology experts who are at the center of discovering what’s next.
Please note: The final decision on whether this position will transition from IBM to Kyndryl is yet to be confirmed.
For additional information about location requirements, please discuss with the recruiter following submission of your application.
Being You @ IBM
IBM is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, pregnancy, disability, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.
Key Job Details
- City: Multiple Cities
- Category: Site Reliability Engineer
- Required Education: Bachelor's Degree
- Position Type: Professional
- Employment Type: Full-Time
- Contract Type: Regular
- Company: (Y026) IBM Ocean Deutschland GmbH
- Req ID: 385909BR
- Travel Required: Up to 25% or 2 days a week (home on weekends, based on project requirements)