dataDataGoose

dataDataGoose Issue #1

www.datadatagoose.com

This is dataDataGoose, a weekly newsletter for Data Science students and hobbyists.

Josh Caulfield
Apr 8, 2023

I’m Josh Caulfield, an InfoSec professional. I decided to pursue a part-time bachelor's degree in data science after realising its potential to be the key skill that shapes the upcoming century. Join me on this learning journey — each week I’ll post a new update here (and to your inbox) sharing everything new I’ve learned and read that week.

This week in Data Science

OpenAI, Google, Microsoft and Facebook have been dominating the AI arms race, with each org releasing or announcing new tools, toys and APIs in recent weeks and months. One notable competitor in the advanced AI space, with far less financial might, has been MIDJOURNEY.

For the uninitiated, Midjourney is a small collective of immensely talented researchers building image generation AI that has succeeded in producing near-photorealistic images capable of deceiving even the most astute audiences.

Did this photo fool you the first time you saw it? Credit: Midjourney

One example of just how hard these generated images can be to tell apart from real photos is the above depiction of Pope Francis dripped out in high-fashion streetwear and jewellery, which sent Twitter users into a frenzy before it was widely reported that the image wasn’t real. This confusion marked a key point in AI progress, where even sceptical audiences familiar with AI-generated images were duped en masse.

Following this episode, Midjourney chose to remove the ability to generate images for free, requiring all users who want to generate images to be registered and paying members of their community. Read more at FORBES.

Learn more about Midjourney HERE - I especially encourage browsing their showcase section, which highlights some of their recent and highest-rated generations and demonstrates the power of this AI tool.

Data Science 101

Often, when dealing with real-world data, values fail to align perfectly with a normal distribution. When outliers are present in a dataset, the shape of its distribution can distort and skew in the direction of the outliers. Learn about data skewness in this short primer:

Understanding Data Skewness
Skewness is the result of data that is asymmetric about its mean. This statistical phenomenon is most clearly …
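As a rough illustration of how outliers pull a distribution, here is a minimal Python sketch. It is not taken from the primer: the `skewness` helper and both sample lists are made up for this example, and the formula is the uncorrected (population) Fisher-Pearson coefficient, which is only one of several common definitions of skewness.

```python
from statistics import fmean, median


def skewness(values):
    """Population (uncorrected) Fisher-Pearson coefficient of skewness."""
    n = len(values)
    mean = fmean(values)
    # Second and third central moments of the sample.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


# Hypothetical data: a symmetric sample, then the same sample with two large outliers.
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outliers = symmetric + [40, 55]

print("symmetric skewness:    ", skewness(symmetric))
print("with-outliers skewness:", skewness(with_outliers))
print("mean vs median:        ", fmean(with_outliers), median(with_outliers))
```

The symmetric sample has a skewness of zero, while adding the two large values gives a positive (right) skew and drags the mean well above the median.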

Courses & Projects

Just over two weeks remain on the HUMBLE BUNDLE “Pocket Guides 2023” bundle, boasting sixteen quick-reference guides from O’Reilly Publishing. The books in this bundle are convenient to reach for when you need a quick refresher while writing code or queries. Many of the titles included are applicable to data science.

This week’s free resource works well as either an intro to or a refresher on common algorithms in Python: a free 5-hour course produced by Real Tough Candy. Understanding common algorithms and being able to implement them is a vital skill in any data scientist’s repertoire. Check out the course on the freeCodeCamp YOUTUBE channel.

Quick Tip

Need some dummy data as a placeholder or for a personal project? Check out MOCKAROO, a quick and easy way to throw together dummy data in line with your project’s requirements and schema.
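If you would rather stay in a notebook, the same idea can be sketched with Python's standard library alone. This is not Mockaroo's API or output format: the schema, the field names, and the `make_row`, `make_dataset` and `to_csv` helpers are all hypothetical, chosen only to show the pattern of generating rows against a schema.

```python
import csv
import io
import random
from datetime import date, timedelta

# Hypothetical value pools for the made-up schema below.
FIRST_NAMES = ["Ada", "Grace", "Alan", "Edsger", "Barbara"]
PLANS = ["free", "basic", "pro"]


def make_row(rng, row_id):
    """Build one dummy record; every field here is an illustrative assumption."""
    signup = date(2023, 1, 1) + timedelta(days=rng.randrange(90))
    return {
        "id": row_id,
        "first_name": rng.choice(FIRST_NAMES),
        "plan": rng.choice(PLANS),
        "monthly_spend": round(rng.uniform(0, 200), 2),
        "signup_date": signup.isoformat(),
    }


def make_dataset(n_rows, seed=0):
    """Generate n_rows records; a fixed seed makes the output reproducible."""
    rng = random.Random(seed)
    return [make_row(rng, i + 1) for i in range(n_rows)]


def to_csv(rows):
    """Serialise the records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = make_dataset(5, seed=42)
print(to_csv(rows))
```

Seeding the generator means the same call always produces the same rows, which is handy when the dummy data feeds tests or a tutorial.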

Casual Corner

This week’s Casual Corner suggestion serves both as superbly informative and engaging entertainment in its own right and as an A* example of video data visualisation. YouTube channel POLYMATTER produces regular, broadcast-quality, mid-length video essays breaking down and explaining novel and often lesser-covered topics, typically covering tech and Asian politics.

The dataDataGoose Community

I’d be foolish to pretend this inaugural issue of dataDataGoose is being sent to a vast community with its non-existent subscriber base so far, but to the few who may encounter this post and are intrigued by the format, I ask: What would you like to see in future issues of dataDataGoose?

Whether you’re interested in data science professionally or as a hobby, I’d love to hear some of your thoughts on what makes the field so interesting to you.

Thank you for reading dataDataGoose. This post is public so feel free to share it.

© 2023 Josh Caulfield