by Leith Akkawi on Nov 20, 2019
Learn Machine Learning using Amazon Web Services
As a data scientist, when assigned a project, your job is to think about data and algorithms. However, to be successful in your career, you also want to think about what clients and employers expect from a data science project.
It does not necessarily matter to your client or employer whether you solve the problem in 1,000 or 10,000 lines of Python code, or whether you prefer decision trees or neural networks.
While it is good to be able to explain your most accurate model, what really matters at the end of the day is whether you solved the problem or not. Once you prove that your solution can solve the problem, that is when you make a difference, and get recognized and rewarded for your time and efforts.
Unfortunately, recent news articles have reported that as many as 85% of projects in the field of Artificial Intelligence have failed to deliver on their promise. If that many projects failed, it would be improper not to question the talent pool of data scientists who are leading these initiatives.
With these unfavorable statistics, it is fair to assume that some data scientists have experienced one or more failures in their careers. Data scientists, by definition, have an excellent understanding of the intersection of many complex fields of study. So how could this failure rate be true?
We must question further, and we must question the knowledge and training of data scientists objectively. This leads us to the root cause of the issue, which is the curriculum of data science programs at universities and bootcamps.
The missing element in the educational content of data science programs is coursework on cloud computing. Programs that have incorporated cloud computing into their curriculum can consider themselves winners!
Academic programs teach data scientists how to solve problems with data that is typically less than 10 gigabytes in size, while treating the theory and practice of Big Data as a separate field of study. But the problems data scientists face in industry can be on the scale of petabytes (millions of gigabytes).
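To get a feel for that gap, here is a minimal back-of-the-envelope sketch in Python. The throughput figure is a made-up assumption for illustration, not a measurement of any real system, and the helper name `hours_to_scan` is hypothetical.

```python
# Decimal units: 1 PB = 10**15 bytes and 1 GB = 10**9 bytes,
# so one petabyte is one million gigabytes.
GB_PER_PB = 1_000_000

# ASSUMPTION: a single machine reads data at a constant 0.5 GB/s.
# This number is illustrative only.
ASSUMED_THROUGHPUT_GB_PER_S = 0.5


def hours_to_scan(size_gb: float, throughput_gb_per_s: float) -> float:
    """Hours needed to read `size_gb` once at a constant throughput."""
    seconds = size_gb / throughput_gb_per_s
    return seconds / 3600


classroom_gb = 10              # a typical classroom-sized dataset
production_gb = 1 * GB_PER_PB  # a one-petabyte production dataset

print(f"Classroom:  {hours_to_scan(classroom_gb, ASSUMED_THROUGHPUT_GB_PER_S):.4f} hours")
print(f"Production: {hours_to_scan(production_gb, ASSUMED_THROUGHPUT_GB_PER_S):.1f} hours")
```

Under that assumed throughput, the classroom dataset is read in about 20 seconds, while a single pass over one petabyte takes roughly 23 days on one machine. That difference is why work at this scale is spread across many machines rather than handled on a laptop.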
One piece of evidence for this view is that the majority of Fortune 500 companies use Amazon Web Services to store, manage, and process their data in the cloud. As companies advance in their analytics maturity, they begin integrating Artificial Intelligence and Machine Learning by hiring teams of data scientists.
But as described above, when a new graduate of an academic program is assigned a project and faces the reality that the project is not a .csv or Excel file with ten million rows or fewer, the project begins to face challenges.
What does failure look like?
A data scientist's first line of support is usually to call in experts from the Big Data teams. But that is where the complexity spikes. The data scientists do not understand the work of the data engineers, the data engineers do not understand the work of the data scientists, and it all gets very messy very quickly.
It follows that data scientists need to fundamentally understand how to work with Big Data. Big Data is a data scientist's responsibility! While many data scientists have already graduated from academic programs in data science and extensive bootcamps, they still need to keep up with market demands and teach themselves cloud computing.
“The real test is not whether you avoid this failure, because you won’t. It’s whether you let it harden or shame you into inaction, or whether you learn from it; whether you choose to persevere.”
— Barack Obama
Welcome to Amazon Web Services
Fortunately, Amazon Web Services has taken on an intensive Research & Development initiative to develop a portfolio of proven Artificial Intelligence and Machine Learning services that integrate well with Big Data. Additionally, Amazon Web Services offers a variety of certifications on these services. Just like everything else a data scientist has studied, Amazon Web Services technologies come with a learning curve.
The game plan is as follows. First, a data scientist needs to study the introductory concepts covered in an entry-level certification such as the AWS Certified Solutions Architect - Associate certification. While preparing for it, a data scientist will learn the fundamental concepts of Identity and Access Management, Compute, Databases, Networking, Virtual Private Cloud, High Availability Architectures, Applications, Serverless, and more.
Once a data scientist knows their way around the cloud platform and can work independently with Big Data as needed, they can begin to explore Machine Learning on Amazon Web Services. The next challenge is to study for the AWS Certified Machine Learning - Specialty certification. There, a data scientist will discover a remarkable toolbox of Artificial Intelligence and Machine Learning services, including SageMaker.
Note: Without proper instructions and training, it is not recommended that a data scientist just start experimenting with these services with company data and billing accounts. To practice and learn, make sure to secure AWS training credits, because these services can get very expensive very quickly if not managed properly.
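One way to build the habit of thinking about cost before launching anything is a small estimate like the sketch below. The hourly rate and the credit amount are hypothetical placeholders, not real AWS prices, and the function names are made up for illustration; look up current pricing for the services you actually use.

```python
def estimated_cost_usd(hourly_rate_usd: float, instance_count: int, hours: float) -> float:
    """Simple linear estimate: rate x number of instances x hours running."""
    return hourly_rate_usd * instance_count * hours


def fits_budget(cost_usd: float, budget_usd: float) -> bool:
    """True when the estimated cost stays within the available budget."""
    return cost_usd <= budget_usd


# ASSUMPTIONS: both figures are invented for this example.
HYPOTHETICAL_RATE_USD_PER_HOUR = 4.0
TRAINING_CREDITS_USD = 100.0

# One instance accidentally left running from Friday evening to Monday evening.
weekend_cost = estimated_cost_usd(HYPOTHETICAL_RATE_USD_PER_HOUR, instance_count=1, hours=72)

print(f"Estimated cost: ${weekend_cost:.2f}")
print(f"Within credits: {fits_budget(weekend_cost, TRAINING_CREDITS_USD)}")
```

With these made-up numbers, a single forgotten instance costs $288 over a weekend and exceeds the credits almost three times over. Real bills depend on the service, instance type, region, storage, and data transfer, so treat this only as a reminder to estimate first and shut resources down when you are done.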
Tutorials from A Cloud Guru
These certifications do not have any academic prerequisites. To study and learn, there is a set of excellent online courses on A Cloud Guru by Ryan Kroonenburg. A Cloud Guru is recommended because Ryan's tutorials explain technologies as integrated systems, whereas other academies focus on each technology separately.
When a data scientist understands how AWS applications work as systems, it is easier to pass a certification and apply this knowledge in the field. By contrast, a data scientist who just memorizes last-minute cheat sheets and practice tests to pass a certification will not be able to make much use of the knowledge gained without a ton of practice and coaching.
Learning the Amazon Web Services platform will make your job easier and better.
You will be able to solve complex problems and build valuable applications.
You will be recognized and rewarded.
Most clients and employers know that an experienced data scientist makes $120,000/year, and that an experienced AWS Solutions Architect Associate also makes $120,000/year. If you prove your expertise in both, nothing should keep you from making $200,000/year.
By 2020, the data science community can transform its perception from one of failure to one of success! With more data scientists trained on Amazon Web Services, we can expect fewer failures due to big data management issues. We can also expect more success stories about projects implemented at lower cost and with more efficient technologies. The data science community can weather this phase better than any other industry, as the community is built on the principles of open source and collaboration.