Job Description
As a Big Data software developer, you will join the Innovation team at Linx+Neemu+Chaordic.
The problems we solve are both deep and diverse: you'll need to know how to store terabytes of data, how to process that data, and finally how to make it accessible securely and efficiently.
The challenges related to storage are mostly of a practical nature: terabytes of data processed daily must be readily available in a high-availability environment, using fast and predictable data retrieval techniques.
To achieve that, you will use innovative distributed computing technologies to handle a gigantic volume of information in a short time, with the expected results. The most commonly used approach relies on MapReduce frameworks such as Apache Hadoop and Spark, but other tools can be (and are) used in particular cases.
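To give a flavor of the MapReduce model mentioned above, here is a minimal, framework-free sketch in plain Python: a word count split into the classic map, shuffle, and reduce phases. The helper names (`map_phase`, `shuffle`, `reduce_phase`) and the sample documents are illustrative only; in production this logic would be distributed by Hadoop or Spark across many machines.

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in a document.
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    # Shuffle: group all emitted values by key, as the framework
    # would do between the map and reduce stages.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate the grouped values (here, sum the counts).
    return {word: sum(counts) for word, counts in groups.items()}

# Toy input standing in for terabytes of documents.
docs = ["big data big results", "data pipelines at scale"]
pairs = chain.from_iterable(map_phase(d) for d in docs)
counts = reduce_phase(shuffle(pairs))
print(counts["big"])   # 2
print(counts["data"])  # 2
```

The point of the model is that each phase is embarrassingly parallel: map tasks run independently per document, and reduce tasks run independently per key.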
The challenges related to information accessibility involve creating the most appropriate access method for each application: a high-performance REST API following a service-oriented architecture; a data stream for building real-time applications; or efficiently stored raw data for batch processing.
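The stream-versus-batch trade-off above can be sketched in a few lines of Python. The function names and the toy event sequence are assumptions for illustration: the batch variant materializes all records before aggregating, while the stream variant keeps only a running aggregate, so its memory use stays constant no matter how many records arrive.

```python
def batch_total(records):
    # Batch: materialize the whole dataset, then process it in one pass.
    data = list(records)
    return sum(data)

def stream_total(records):
    # Stream: consume records one at a time, keeping only a running
    # aggregate; memory use does not grow with the input size.
    total = 0
    for record in records:
        total += record
    return total

events = range(1, 101)  # toy stand-in for an event stream
print(batch_total(events))   # 5050
print(stream_total(events))  # 5050
```

Both compute the same answer; the difference is when the data is available and how much of it must be held at once, which is exactly what drives the choice between a batch store and a streaming interface.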
Requirements:
- BS, MS, or PhD in Computer Science or equivalent work experience;
- Knowledge of algorithms, data structures and systems design;
- Expert understanding of techniques and best practices for handling extremely large volumes of data;
- Strong background in distributed systems;
- Good analytical skills with demonstrated experience turning data into actionable insights;
- Fluency in Java or Scala (or the ability to acquire it quickly) is a must. Good knowledge of Python is highly desirable. Experience with shell scripting will earn you extra points;
- Comfortable in a small, intense, high-growth start-up environment;
- Experience with large datasets and MapReduce architectures like Hadoop/Hive/Spark is a plus.