Job Title: Dev Ops Engineer
Reports to: Director of Systems
Classification: Staff, Full-Time
Date: February 2017
Purpose of the job:
As a member of the Technology team, the Dev Ops Engineer is responsible for ensuring the smooth operation and stability of Linux-based servers and workstations, applications, and services in a heterogeneous CentOS, Windows 7/10, and OS X environment. Working in concert with the software and pipeline departments, the Dev Ops Engineer will support a 24/7 facility with 700+ workstations and servers and a large-scale render farm accessing NetApp NAS storage across multiple data centers in a WAN environment for a leading visual effects studio.
- Implement systems that are highly available, scalable, and self-healing
- Liaise with software and pipeline departments to test and implement new initiatives
- Test and modify systems and services to ensure that they operate reliably
- Implement and manage continuous delivery systems and methodologies
- Understand, implement, and automate security controls and governance processes for compliance
- Define and deploy central monitoring, metrics, and logging systems for servers and render nodes
- Design, manage, and maintain tools to automate operational processes
- Create, update and deploy workstation, render node and server OS images
- Create and implement test plans and deployments for major OS version release upgrades
- Maintain and deploy operating system patches and updates
- Troubleshoot and optimize Linux servers and installed software
- Validate patches, applications, and new software version releases against existing systems
- Maintain and modify scripts and programs to support business processes
- Document system installations, configurations, policies, and procedures
- Create and update documentation, including maintenance logs and end-user training and onboarding materials
- Perform upgrades and version management of third-party applications
- Manage and administer Aspera transfer servers and services
- Manage and administer the Confluence documentation system
Education and/or Experience Required:
- Bachelor of Science (4-year) degree in a technical major or equivalent experience required.
- Must be an enthusiastic self-starter who works well in a team environment.
- Ability to multitask while managing and prioritizing multiple projects.
- Excellent verbal and written communication.
- Strong interpersonal skills required.
- Ability to quickly assess complex problems and make critical decisions to resolve them.
- Experience with security control audit requirements and file system security
- Experience using LDAP integration with AD in a heterogeneous Windows/Linux environment
- Experience using tools such as Puppet, Chef, Vagrant, and Ansible for configuration management
- Experience with Nagios, Ganglia, Logstash, and Elasticsearch
- Familiarity with installing and maintaining PostgreSQL and MySQL
- Juniper and Brocade management experience
- Experience managing large-scale render farms (thousands of nodes), including rapid image deployments, monitoring, and HP iLO / IPMI management
- Experience managing and configuring workstations to be accessed via zero-client technologies
- Experience with VMware hypervisors (ESX) and Docker deployments
- Familiarity with version control systems: Git, SVN, CVS, and Perforce
- Programming experience in Python, PHP, Perl, and C++
- Experience in a VFX production environment preferred.
- Occasional after-hours and weekend work should be expected in this role to facilitate systems maintenance
Working Conditions and Environment/Physical Demands:
- Office working environment.
- Hours for this position are based on normal working hours, but extra hours will be required depending on production needs.
The above statements are intended to describe the general nature and level of the work being performed by people assigned to this job. This is not an exhaustive list of all duties and responsibilities associated with it. Digital Domain 3.0, Inc. management reserves the right to amend and change responsibilities to meet business and organizational needs.