Learnings from my first 1.5 years freelancing in data

0. Ground Zero: Learning the Mindset

Your opportunity cost is an employed job.

Your worst-case scenario is getting a job. Basically, if you fail, you still end up in a pretty decent place. As such, it’s worth taking the risk and trying freelancing if you think it might be for you.

 

Do it now, not later.

From my observation of people who say “later”: fewer than 20% of them actually make the switch within the first year. If you do it now, you have a 100% conversion rate. Stop delaying. Your opportunity cost as an employee is your would-be freelance hourly rate, which would probably translate into 2x your salary (unless you already make 200k).

Once you have become a freelancer, you will look back and ask, “Was it worth paying 5K/month to stay in that job, instead of having extra freedom and responsibility?” The answer will be a definite no.

I know it’s hard to see it that way without living it first, but try to put yourself in the freelancer’s shoes.

 

Value:

Don’t doubt your value or give in to impostor syndrome. Good workers are worth their weight in gold. A freelancer can actually be much cheaper than an employee, because they bring in the right skills at the right time, slashing through the company’s cost of delay. Many freelancers also become better professionals thanks to their varied exposure, and they end up being worth those rates even as salaried employees; the top 1% of the workforce has very different salaries anyway.

Value-based pricing in data, in the context of organisational change that brings 20-40% revenue growth, would be worth millions for many companies. Your expertise is key to making that happen, and your hourly rate is peanuts in comparison. Keep in mind that world-scale Asian companies sometimes pay millions per year in salary for the data scientist they want; all is good as long as the ROI is fantastic, and it typically is for companies with the scale to make it matter.

 

How to start:

Starting well is key to first-year survival. If you want to keep your transition to freelancing easy, without risking large gaps in your monthly pay, consider the following tips:

1. Starting: Be able to take a freelance job.

The vast majority of freelance gigs get filled relatively fast (<1 month), because freelancers tend to have short projects and free up more often. As such, it is ideal to be able to start work on a new project on short notice, ideally within 2-4 weeks.
There are a few ways to get into that situation:

a) Be in an easy-to-leave job

If you just started a new job, are in the probation period, and it turns out to be different from what you wanted, you can look for a freelance project and jump ship quickly. If you’re stale in a job with a three-month notice period, quit, start looking for a project, and if you cannot find one and cannot afford the time unemployed, just take another job (and maybe keep looking in parallel).

b) Get fired.

Myself, I was in a situation where I could not work: the futility of the work was so clear, and the requirements so hazy, that I was unable to produce anything that wasn’t nonsense. I had a boss who dictated ‘SQL only’ because that was the best he could do. He personally checked the code in every pull request of his data team. He rewrote everything himself, and left only the bugs for the others to worry about. He believed he had invented ‘truncsert’ (truncate the table and insert from a view in the database), a form of non-incremental data loading where you still have to maintain DDLs in the database and deal with migrations: the worst of both worlds. I was in a team of mixed competences, including brown-nosers who kept their mouths shut and collected bonuses and promotions. I wouldn’t be a part of that, and I got fired.

Getting fired gave me two ‘paid’ months. It gave me one month to recover from the experience, after which I started looking for freelance projects. My girlfriend at the time (now my wife) supported me in sticking to what was important and not jumping into another job. Two months after being fired, I was already starting my first project.

I put ‘paid’ in quotes because, in hindsight, the unemployment benefits were too small to be relevant; they were a mental comfort more than a significant financial contribution. Frankly, if you can afford it, don’t wait to get fired. Just quit and save yourself the hassle. In the long run, your cost of waiting is higher than the unemployment benefits.

c) Not have a job in the first place.

I have met quite a few freelancers who started freelancing during university, never took a salaried job, and have been freelancing for years since. Wherever you are in your life, if you are not currently employed, that’s already a step in the right direction. Get a project instead.

2. Find a freelance job.

Agencies and other freelancers are your best friends. Connect on LinkedIn with agencies (try big ones like Hays or DIS AG for Germany rather than typical recruiters), and they will let you know every time a large client is looking for freelancers. Work with agencies that specialize in freelancing; the others are optimized for permanent positions and don’t get freelance opportunities (or get them late). And do not work with amateurs: if they are flaky or scammy, avoid them at all costs.

Freelancers are best met face to face. Look them up on LinkedIn and ask them to breakfast/lunch/dinner. Be prepared to travel; it’s you who wants something. Form a network, get remembered, and you will have work coming in. Not just data freelancers, but freelancers from any intersecting domain. Data? Don’t forget tech, finance and marketing freelancers.

Hiring managers are also typical customers. CTOs, founders, team leads. They sometimes need something done well on the first try, so they bring in expertise while they build and hire. Sometimes they just want a replacement to fill a gap left by someone who changed positions, quit, or went on parental leave or sabbatical.

Be flexible with your requirements and don’t be too fussy, as long as the work is in your field of expertise or specialty. It’s better to start off OK than to not start at all. Even OK freelancing beats employment.

3. Do a freelance job.

Once you’ve found a job, interview as usual and come to work as usual. You’re basically an employee, so you are expected to perform at the level of seniority you were hired for. Whatever you do, you’ll be fine as long as you have been honest about your abilities.

Get your accounting sorted. If you’re into this stuff, do it yourself; otherwise, an accountant is TOTALLY worth it. Your time and peace of mind are worth more than the accounting fees, and good advice will take you further financially. A skilled tax accountant will often save you significantly more than you pay them. If you’re a Berliner, I recommend Suat Göydeniz or Firma.de. Definitely find someone you are comfortable with and who answers your questions quickly.

My original choice for accounting was felix1, which turned out to be horrible. It took me 5 months to get my money back from them, and only after repeatedly reaching out to one of their C-levels on LinkedIn to ask whether these are their business practices or I just fell through the cracks. Two months after my first call and payment, they had still not started performing the service I paid for. I almost lost a client because of them. Don’t compromise on a decent accountant.

Set aside money for taxes. Find some way to approximate your income tax in real time (I use a Google Sheet with the brackets in a formula) and make sure you can afford the year-end bill or tax prepayments. As a rule of thumb, if you save 50% of what you bill for taxes and health insurance, you should be fine.
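For illustration, here is what that bracket formula can look like in code instead of a spreadsheet. The brackets below are made up for the sketch; substitute your country’s actual ones.

```python
# Estimate income tax with progressive (marginal) brackets.
# NOTE: illustrative brackets only, not any country's real rates.
BRACKETS = [            # (upper bound of bracket, marginal rate)
    (10_000, 0.00),
    (60_000, 0.30),
    (float("inf"), 0.42),
]

def income_tax(taxable: float) -> float:
    tax, lower = 0.0, 0.0
    for upper, rate in BRACKETS:
        if taxable > lower:
            # only the slice of income inside this bracket is taxed at its rate
            tax += (min(taxable, upper) - lower) * rate
        lower = upper
    return tax

print(income_tax(90_000))  # 27600.0 with the made-up brackets above
```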

Learning 1: There is always more work.

It’s like in a job: every day there’s something that needs doing. Finished your main project goal? Your client will often find more uses for you.

If you’re hired by a team, they will give you as much work, or as many nice-to-haves, as they have budget for. Why not? It’s always nice to have a senior helping hand.

If you’re hired by a business owner, you’re typically in for a lasting relationship as long as your ROI is positive. In data speak, that lasts either as long as the client is big enough that your optimizations are worth your pay, or until they become data-mature to the point where most things are already optimized.

Learning 2: Vacation is different.

You know the joke about freelancers not having employer-paid medical leave? “Freelancers don’t get sick.”
Of course you do, because you’re human. And it’s OK to take unpaid medical leave, because you deserve the best care you can afford: you are your own best workhorse, and only through good health are you able to perform. Don’t mistreat yourself for money. Besides, you don’t have to feel bad about missing work: your client pays nothing for this time, so it’s even more relaxing than paid sick leave.

Sick leave aside, don’t end up calculating your vacation’s opportunity cost. 40 hours/week × 100/h × 0.65 net rate = 2,600 in your pocket, and that’s not actually relevant to you enjoying your life. Remember why you are freelancing, and imagine what happens if you jump 5 years into the future and you’ve done nothing with your life but work. Don’t forget: your goal is to live the way you want, not to earn bank points. Take a week off if you feel the need to connect with your inner self, learn something new, spend time with your loved ones, follow a hobby, find divinity in the ordinary or go fishing.

You’re the youngest you’ll ever be, right now. Don’t sell that.

Learning 3: How to fail.

Freelancing and entrepreneurship are not the same.

Ask any data (or tech) freelancer if they would ever take a full-time job again. They will tell you: definitely NOT! Whatever reasons they have (family, time, money, learning, culture, life), taking a job would be detrimental in almost every way. You’d halve your pay, let your skills get rusty, let company politics get to you, get stuck with all the poop, miss out on vacations and life, surrender control over your time, and so on. Same recipe, different taste.

Except…

Except for the ones who are struggling, or who struggled and sank. How do you sink? From my observations there are two easy ways:

a. Charge too little.

Discount 50% for remote work. Don’t charge for toilet breaks. Don’t charge travel expenses. Don’t charge for client calls. Don’t charge travel time for in-person meetings.

This is a surefire way to make your effective rate lower than what you’d make in a job, but with much more hassle and stress on top. Don’t do this to yourself. Keep your opportunity cost in mind.

b. Build a product upfront.

If you believe you can sell a product, don’t build it upfront. Find a client and build it for them, while charging them for the work.

This way, your worst-case scenario is not finding a client. It’s better to not find a client and lose only the time spent looking, than to not find a client and lose the product-building time as well. Your best-case scenario? Still better than building upfront, since with a client you get paid from the start.

Learning 4: About going to Asia

I have heard from 2 clients so far that they had another freelancer who went to Asia (Thailand, Vietnam) and never came back. You can imagine my déjà vu the second time I heard this. When I told this story, I heard yet another one, about a freelancer (not in data this time) who went to Vietnam while working remotely and decided to stay and open a bar. Apologies if I am getting any of the three mixed up.

Regardless, you can go to Asia, but keep in mind that rates are lower there. So if you’re going to take a pay cut anyway, rather take a vacation and enjoy travelling.

Go to Asia, or wherever you wanna travel to, and enjoy.

Learning 5: The ideal client

The ideal client differs depending on what you are trying to achieve. In a perfect world, the perfect client is someone who is not price sensitive (you don’t want to argue about discounting a train ride to the office), who always buys your work, but who at the same time doesn’t depend on you.

However, while you’re not in control of what clients exist on the market, you are in control of how you develop your relationship.

Maintain a healthy long-term relationship without becoming a crutch for your client. You want a client with whom you are honest, and whom you advise on what’s best for them, even if it isn’t in your direct interest.

A smart client understands the value you are able to bring, and will always have a place for you, or a recommendation. A successful freelancer is someone who stays competitive in the workforce (read: a professional who doesn’t stop learning), and such workers are rare and add more value than the average. So a smart client will have no issue paying a freelancer’s rate, either until the role is staffed or for the duration of a project (and often after, if the budget is there). Of course, know your strengths and weaknesses: tackle the projects you can handle for your client, and don’t jump on things you cannot do well.

A smart freelancer manages expectations, keeping the client informed and able to make good decisions regarding ROI.

Learning 6: Work ethics

Having met various freelancers, I can confidently say most of them are nice and have a good work ethic. In fact, it’s more common for freelancers to be a little less fussy about the details when billing (read: charge less than they should) than the other way around.

However, there are bad apples out there. Every forest has its stumps. Don’t be one of them.

Your work ethic should be solid. Always be honest with the client about your relationship, bill fairly, and don’t double-book. Why? Because without values, all you have is a racket.

What if you are reselling code that took you hours to build the first time around? There’s no rule for this. I have seen the following approaches:

a) Bill nothing extra, just bill for the time.

I currently do this, but I’m having second thoughts, as it feels counterproductive to bill 2h for implementing something that would take a week to build from scratch.

b) Bill implementation + fraction (up to full) of the hours it took to first build.

Agencies do this. They fill in a config and suddenly hundreds of lines of code are hard at work in your codebase.

c) Fixed price project

Here, I would stay away unless you know your client: you might end up with scope creep, and those 50 budgeted hours that were supposed to be 20 for implementation suddenly become 100.

Learning 7: Hourly or daily? Fixed price?

I bill hourly, rounded down to the nearest 10-minute increment, with the exception of short projects, where I round up to the hour because of overhead and the cost of switching.

I would advise going for hourly billing. Why? Because sometimes work days run over 8h. More often, the day’s work is finished after 7:30h, and warming a seat for another 30 minutes is counterproductive for both you and your client; just call it a day then. I found I average just under 8h when there’s no pressure, and just under 9h on heavy-load days and weeks. Another freelancer recently told me they switched from daily to hourly billing and now bill 20% more. Others told me how important it is to be able to come in 2h late when life appointments demand it, without worrying about how to make up the time. Hourly = bill what’s fair.

The pro of a daily rate is that, depending on your client, you might get peppered with small requests. Charging half a day or a full day for something that takes 10 minutes but blocks you from other work is fair; but you can do this with hourly billing too, as long as you are upfront about it. Or you could charge small clients a daily rate and large ones hourly.

Fixed price? I feel this is a gamble. Usually, those who advocate fixed pricing inflate the average price by something like 5x to give themselves plenty of wiggle room and high profit if accepted. Otherwise they would bill hourly and save themselves the headaches of scoping. As a client, you should also be aware that any time estimate you receive does not account for blockers on your side, since they are unknown to an outsider.

 

The data flow breakdown

For easy reference, I’d like to define the typical data flow encountered in business intelligence by its steps: data integration, dimensional modelling, and data consumption. These are the typical flows that make first- and third-party data available for analysis. Above all, it is this data flow that brings an ROI when investing in BI.

1. Data flow part one: Data Integration

Integration measures how many of the first- and third-party data sources have been unified in a single place. This single place is usually called an integration layer. It can be any easy-to-access data store; typically it takes the form of files on a network location (FTP, S3, GCS) or a relational database.

Usually this part is pretty messy, because the external sources are diverse and different. Often each source requires its own access method to extract the relevant data. As a result, you get a very messy environment; ideally, handle it with a widely used scripting language to ‘glue’ the systems together. It is good practice here to use some form of shared utils to keep your code DRY: common utils for loading data sources will prove very time-saving.
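As a sketch of what such a shared util can look like (the PostgreSQL integration layer and all names here are assumptions, not a prescription): each source-specific extract script does its own API calls, then hands rows to one common loader.

```python
import csv
import io

import psycopg2  # assuming a PostgreSQL integration layer


def load_rows(conn, table, rows):
    """Shared loader: every source-specific extract script ends by calling this."""
    if not rows:
        return
    cols = list(rows[0].keys())
    buf = io.StringIO()
    writer = csv.writer(buf)
    for row in rows:
        writer.writerow([row[c] for c in cols])
    buf.seek(0)
    with conn.cursor() as cur:
        # COPY is the fastest way to bulk-load into Postgres
        cur.copy_expert(f"COPY {table} ({', '.join(cols)}) FROM STDIN WITH CSV", buf)
    conn.commit()
```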

2. Data flow part two: Dimensional Modelling

Dimensional modelling, as a step in the data flow, is the transformation of integration-layer data into a format suitable for analysis, usually in the form of a snowflake schema. The goal is to arrange the dimension and fact tables such that metrics are unified, and thus the same across reports.

Some small reporting set-ups omit dimensional modelling, but it is useful in enabling self-service reporting.

Ultimately, the end goal of dimensional modelling is to provide fact and dimension tables containing correct data. Later on, you will use this data for reporting and analysis, so ideally locate it in a SQL database for easy access.
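As a tiny illustration (all table and column names here are hypothetical): once the fact and dimension tables exist, a report becomes one unambiguous query, and every report computes revenue the same way.

```sql
-- revenue by month and product category, from one fact table
SELECT d.calendar_month,
       p.product_category,
       SUM(f.revenue) AS revenue
FROM fact_orders f
JOIN dim_date d    ON d.date_id = f.date_id
JOIN dim_product p ON p.product_id = f.product_id
GROUP BY 1, 2;
```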

3. Data flow part three: Data consumption

Here, I want to avoid the terms reporting and analysis. In my opinion, reporting is just one of many channels a data consumer or analyst uses to access data. Data consumption can take the form of an analysis that requires ad hoc query writing and statistics, or, say, a seasonal product analysis based on data from a product performance dashboard.

The end goal of data consumption is to boost business metrics through informed decision-making. Consequently, this is where investing in data actually pays off, so make sure the needs of the decision-makers are met.

Python and SQL open source business intelligence stack

I’d like to cover some basics about the common business intelligence stack of Python and SQL: why this stack is so popular, and why it’s not yet the one and only gold standard.


Why Python for business intelligence?

Readability.

Most business intelligence professionals using Python come from all walks of life and have no formal programming education. This can lead to incredibly random ways of writing code; less so in Python, though, since Python enforces indentation.
This doesn’t mean that someone clueless will produce good code or good architecture, and regardless of how clean the code is, people can still do very obscure, roundabout, complex things. What it does mean is that their code is very easy to read, which is the first step in understanding it.

Staffing.

Python is an easy-to-learn, widespread, easy-to-use language. It’s much easier to find Python developers than, say, developers with VBA knowledge, and it’s harder still to find developers with both BI knowledge and knowledge of other, more niche languages.

Libraries.

For BI work, you need to extract and load data, transform data, and analyze data. Python can do it all: most advertising data producers offer Python libraries to access their APIs, Python can access any database you can think of, it natively has great features for data munging, and it can be used for large parallel computations or statistical analysis. Data science too.
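A toy end-to-end example (the URL, column names and connection string are all hypothetical): extract from an API, transform, load into the warehouse, a few lines each.

```python
import pandas as pd
from sqlalchemy import create_engine

# extract: many ad platforms expose JSON endpoints or Python SDKs
df = pd.read_json("https://api.example.com/spend?day=2019-01-01")

# transform: derive cost-per-click, avoiding division by zero
df["cpc"] = df["cost"] / df["clicks"].where(df["clicks"] != 0)

# load: append into the warehouse
engine = create_engine("postgresql://user:password@localhost/dwh")
df.to_sql("ad_spend", engine, if_exists="append", index=False)
```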

Widespread usage/versatility.

This means that whatever problem you encounter, someone else has probably encountered it too. You will have a very easy time finding solutions or getting help for anything Python. While other languages might be better at specific things, Python is a jack of all trades and master of some.

Why not Python?

Spoon-fed.

Many BI developers stop at Python because the language is so versatile. This means they never get the opportunity to work with languages designed for large software projects, and they miss out on the best practices and programming paradigms those bring. Too easy sometimes means people stop learning.

Slow.

Python is slower than most comparable languages. In the context of working with data, this only becomes an issue when doing complex calculations on very large data sets.

Why SQL for business intelligence?

I will talk here specifically about PostgreSQL for small data, since it is the best open-source database for analysis. It has better functionality than paid solutions like Microsoft’s, and equal or better performance and functionality compared to Oracle’s product. It is free, without licensing fees or vendor lock-in, and it is built by developers, for developers, with robustness and ease of use in mind. You can even run Python on Postgres 🙂

For big data, the leading SQL solution seems to be Amazon’s Postgres-like product, Redshift.

Ease of access:

For the sake of easy data access, SQL makes a lot of sense. Plenty of tools can run SQL against a database to generate dashboards, or connect it easily to Excel. And it is easy for analysts to learn.

Ease of data manipulation:

Most of your data will not be single numbers or random sentences; it will be structured in tables. This means that instead of conventional algebra, we use relational algebra, performing operations between data sets instead of single values. This is where SQL shines, since it is designed specifically for these kinds of operations. Once your data is in a database, you can leverage this language to perform them with ease.
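For instance, a set difference that would be a loop-and-lookup in a general-purpose language is a single statement in SQL (hypothetical tables):

```sql
-- customers who never placed an order
SELECT id FROM customers
EXCEPT
SELECT customer_id FROM orders;
```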

Staffing:

Every self-respecting BI professional knows at least a little SQL. Most good BI engineers know a lot of it. The particular flavor of SQL is not very relevant either, as the dialects are very similar.

Why not SQL?

Poor options for architecture.

SQL is a query language, not a programming language. It’s meant to rearrange and compute data, not to express complex operations. You end up producing a lot of code, often WET (the opposite of DRY). The code ends up monolithic, and it is not easy to see at a glance what a query does. Basically, you cannot easily write maintainable code.

Lowers the bar.

SQL is so easy to learn that a lot of people end up being able to use it. However, that does not suffice for writing clean code or designing an ETL, a BI application, or a data warehouse. I’ve personally seen ridiculous projects that were a dependency hell of views on views on views. Sadly, knowing SQL says little about one’s ability to deliver a quality product, but the two often get confused in the world of BI.

I hope you enjoyed this article, and if you have any opinions on the topic do not hesitate to leave a comment.

Twitter ads API: gotchas when downloading reports

I recently built an ETL to download ad reports from the Twitter ads API.

If you are about to do the same, take note of the following gotchas.

Include historic data in your download

First, you should download at least 2 weeks of history with every new daily load. The reason? Twitter updates ads data historically as they work out the actual numbers (removing bots, spam, etc.). This is common behavior for many ad APIs.

Provide datetime

Second, because of their API design, you NEED to provide a start datetime and an end datetime covering whole days. Notice it requires a datetime, not a date, even though you clearly request daily aggregation! Additionally, you need to provide midnight of your account’s timezone, expressed in UTC, or the API call will return an error. (Don’t ask me why you must handle this on your end when the timezone info comes from the API…)

You can retrieve the timezone of each account from the endpoint '/2/accounts/your_account_id', in body['data']['timezone'] of the response. Then calculate the start and end dates for your report request and adjust both to the right UTC offset based on the account timezone. You can achieve this easily in Python with the pytz module. Note that if your start-end interval contains a daylight savings time change, you cannot apply the same offset to both dates, as the offset from UTC changes; the best approach is to calculate the offset separately for each of the start and end dates, as sketched below.
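A minimal sketch of that calculation (the function name and shape are mine, not from Twitter’s docs):

```python
from datetime import datetime, timedelta

import pytz


def report_window_utc(account_tz, start_date, end_date):
    """Return UTC datetimes for the local midnights bounding [start_date, end_date]."""
    tz = pytz.timezone(account_tz)  # e.g. body['data']['timezone']
    # localize each boundary separately: across a DST change the UTC offset differs
    start_local = tz.localize(datetime(start_date.year, start_date.month, start_date.day))
    end_day = end_date + timedelta(days=1)  # end boundary is the *next* midnight
    end_local = tz.localize(datetime(end_day.year, end_day.month, end_day.day))
    return start_local.astimezone(pytz.utc), end_local.astimezone(pytz.utc)
```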

Keep watching for gotchas in the Twitter ads API

There are a few other gotchas and undocumented limitations in their API, but you will find those easier to address. (Example: only request stats for 10 tweet_ids at a time.)

Yes, it stinks, but at least the Twitter ads API is still better than the Bing ads API.

Good luck!

MySQL Federated database creation

MySQL does not support entire federated databases, but it does support federated tables, as long as the option is enabled in my.ini or my.cnf.

To create an entire federated DB in a non-manual, easy-to-maintain way (refreshing the table definitions if the source table structure changes), we can use the following procedure.

First, the caveats:

1. Federated tables do not use indices in the usual way, so you may need to create and index a table locally before expecting any performance.

2. When using a client such as HeidiSQL that runs SHOW TABLE STATUS, the connection between your DB and the source DB of the federated table remains open. This becomes an issue if the number of tables is significant while max_connections on the source DB is low. To get around it, periodically flush tables on the destination DB to close the connections. There are alternatives, but this is the simplest workaround (maybe not the best).

Personally, to get around this issue I scheduled a MySQL event that flushes each federated table individually (not one global FLUSH TABLES, but FLUSH TABLE tablename; FLUSH TABLE tablename2; …), to avoid blocking other queries. FLUSH TABLES flushes all tables at once, so it has to wait for every running query to finish, and any new queries have to wait for it in turn, ruling out concurrency.
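A sketch of such an event, assuming the event scheduler is enabled and with hypothetical table names:

```sql
DELIMITER $$
CREATE EVENT flush_federated
ON SCHEDULE EVERY 1 HOUR
DO
BEGIN
  -- one FLUSH TABLE per federated table, so other queries are not blocked
  FLUSH TABLE fed_customers;
  FLUSH TABLE fed_orders;
END$$
DELIMITER ;
```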

1. Get a no-data mysqldump the same way you get a normal dump, but with the -d switch for no data:
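For example (host and credentials are placeholders):

```
mysqldump -d -h source-host -u source_user -p source_db > nodata.sql
```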

 

2. Convert the ENGINE part of each table definition into FEDERATED plus a CONNECTION string (here done with Python):
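The original snippet did not survive on this page; here is a minimal sketch that rewrites each ENGINE clause in the dump, with a placeholder connection string:

```python
import re

# placeholder connection string pointing at the source server
SOURCE = "mysql://fed_user:fed_pass@source-host:3306/source_db"

def federate(dump_sql):
    out, table = [], None
    for line in dump_sql.splitlines():
        m = re.match(r"CREATE TABLE `(\w+)`", line)
        if m:
            table = m.group(1)  # remember the table the CONNECTION must point at
        # mysqldump ends each table definition with ') ENGINE=... ;'
        line = re.sub(
            r"ENGINE=\w+[^;]*",
            f"ENGINE=FEDERATED CONNECTION='{SOURCE}/{table}'",
            line,
        )
        out.append(line)
    return "\n".join(out)

with open("nodata.sql") as f:
    print(federate(f.read()))  # redirect stdout to federated.sql
```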

 

3. Load up your new dump:
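Assuming you saved the converted dump as federated.sql (placeholders again):

```
mysql -h destination-host -u dest_user -p dest_db < federated.sql
```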

 

You can run this periodically, or on demand, to refresh the table definitions whenever the source definitions change.

MySQL Rank() function

UPDATE: On July 18, 2018, the MySQL server team announced the introduction of window functions. This article was written before that, offering a workaround for similar functionality. I have updated it with the current best-practice approach, but still leave the old article below for reference.

As of MySQL 8, you can use window functions such as RANK() OVER (PARTITION BY country). Interestingly, you can reference the window by an alias and define it in a separate WINDOW clause (a query section, like the WHERE clause), allowing reuse:
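A minimal example (table and column names are hypothetical):

```sql
SELECT country,
       score,
       RANK() OVER w AS country_rank
FROM scores
WINDOW w AS (PARTITION BY country ORDER BY score DESC);
```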

 

Happy window aggregating!

OLD ARTICLE FROM 2014 BELOW:

A quick-and-dirty custom MySQL rank() function to emulate RANK() from PostgreSQL in MySQL. The same disclaimers apply as for the custom rownum() function. In this case I am using it to rank decimals, but feel free to change the input type.
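The original listing was lost from this page; below is a minimal reconstruction in the spirit the disclaimers describe, using session variables that reset when the statement timestamp changes. Treat it as a sketch, not the original code.

```sql
DELIMITER $$
-- note: in MySQL 8+ the name rank is reserved (you would use RANK() OVER instead)
CREATE FUNCTION rank(val DECIMAL(20,6)) RETURNS INT
NOT DETERMINISTIC NO SQL
BEGIN
  -- NOW(6) is fixed per statement, so a changed value means a new query: reset
  IF @rank_ts IS NULL OR @rank_ts <> NOW(6) THEN
    SET @rank_ts = NOW(6), @rank_row = 0, @rank_prev = NULL, @rank_val = 0;
  END IF;
  SET @rank_row = @rank_row + 1;
  -- ties keep their rank; a new value jumps to the current row number
  IF @rank_prev IS NULL OR @rank_prev <> val THEN
    SET @rank_val = @rank_row;
  END IF;
  SET @rank_prev = val;
  RETURN @rank_val;
END$$
DELIMITER ;
```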

 

To invoke it, pass the column to rank, as below:
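For example, against a hypothetical scores table (order the rows in a subquery, since the session variables are evaluated in scan order):

```sql
SELECT rank(score) AS rnk, score
FROM (SELECT score FROM scores ORDER BY score DESC) ordered;
```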

output:
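Illustrative output for the scores 9.5, 9.5 and 7.0 (ties share a rank; the next rank skips):

```
rnk  score
1    9.5
1    9.5
3    7.0
```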

MySQL rownum() row number function

UPDATE: On July 18, 2018, the MySQL server team announced the introduction of window functions. This article was written before that, offering a workaround for similar functionality. I have updated it with the current best-practice approach, but still leave the old article below for reference.

As of MySQL 8, you can use window functions such as ROW_NUMBER() OVER (PARTITION BY country). Interestingly, you can reference the window by an alias and define it in a separate WINDOW clause (a query section, like the WHERE clause), allowing reuse:
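A minimal example (table and column names are hypothetical):

```sql
SELECT country,
       city,
       ROW_NUMBER() OVER w AS rn
FROM cities
WINDOW w AS (PARTITION BY country ORDER BY city);
```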

 

Happy window aggregating!

 

OLD ARTICLE FROM 2014 BELOW:

MySQL does not have a built-in row number function, which makes getting a row number an ugly hack; an ugly hack that needs to be repeated over and over, every time you need it.

I have taken the liberty of writing a rownum() function that is just as bad, but at least it will keep your code clean.

DISCLAIMER:

  1. It does not work on older versions of MySQL.
  2. It is based on variables that only reset when the time changes. If the function is invoked in 2 separate statements within the same microsecond (1 millionth of a second), the output will not be as expected: the variables will not reset, and the counting will not restart from 1.
  3. If the function is invoked more than once in the same query, it will increment on every invocation, so possibly multiple times per row.

For the purposes of BI, when used as intended, I find the associated risks non-existent.
For older versions of MySQL server, you can try finding another creative way to reset the counter for the row number.

 

Row number function
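The original listing was lost here; below is a minimal reconstruction of the approach the disclaimers describe: session variables that reset whenever the statement timestamp changes.

```sql
DELIMITER $$
CREATE FUNCTION rownum() RETURNS INT
NOT DETERMINISTIC NO SQL
BEGIN
  -- NOW(6) is fixed for the duration of a statement, so a changed value
  -- means a new statement: reset the counter (hence disclaimers 1 and 2)
  IF @rownum_ts IS NULL OR @rownum_ts <> NOW(6) THEN
    SET @rownum_ts = NOW(6), @rownum_counter = 0;
  END IF;
  SET @rownum_counter = @rownum_counter + 1;
  RETURN @rownum_counter;
END$$
DELIMITER ;
```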

 

To invoke it, use rownum(), as below:
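For example, numbering the rows of a hypothetical table:

```sql
SELECT rownum() AS rn, t.*
FROM some_table t;
```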

Good luck!

Get yesterday’s date in a Windows batch file – correct version

Hi all,

If you use a Windows server and need yesterday’s date for something like a download, you have probably used a popular script that can be found here and there. Unfortunately, that script is somewhat broken when dealing with the last day of the month. The logic is solid, but somewhere between the idea and the implementation something was lost.

See below for the working, correct version:
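The corrected listing did not survive on this page. As a sketch of a simpler route: instead of doing month-length arithmetic in pure batch, you can shell out to PowerShell (present on any modern Windows server), which removes the end-of-month edge case entirely:

```bat
@echo off
rem yesterday as yyyy-MM-dd, locale-independent, no manual date arithmetic
for /f "usebackq delims=" %%i in (`powershell -NoProfile -Command "(Get-Date).AddDays(-1).ToString('yyyy-MM-dd')"`) do set YESTERDAY=%%i
echo %YESTERDAY%
```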

 

 

Normalizing codes: a MySQL alphanumeric extraction function

I wanted to share an SQL function I end up using quite frequently when dealing with user-input data: normalization to alphanumeric characters. It can also be modified into a function that extracts only numeric values; if you do that, make sure you keep RETURNS VARCHAR, for cases where the extracted number begins with 0. You can also allow additional characters, such as the minus sign, if required.

Example use cases:

Matching producer codes:
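For instance, joining two catalogues whose codes differ only in punctuation and spacing (hypothetical tables):

```sql
SELECT p.*, f.price
FROM products p
JOIN supplier_feed f
  ON alphanum(p.producer_code) = alphanum(f.producer_code);
```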

Normalizing/matching phone numbers:
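Or collapsing differently formatted phone numbers to a comparable form:

```sql
SELECT alphanum('+49 (30) 1234-567');  -- returns '49301234567'
```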

 

function:
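The original listing was lost from this page; here is a minimal reconstruction that walks the string and keeps only alphanumeric characters:

```sql
DELIMITER $$
CREATE FUNCTION alphanum(input VARCHAR(255)) RETURNS VARCHAR(255)
DETERMINISTIC NO SQL
BEGIN
  DECLARE i INT DEFAULT 1;
  DECLARE c CHAR(1);
  DECLARE result VARCHAR(255) DEFAULT '';
  WHILE i <= CHAR_LENGTH(input) DO
    SET c = SUBSTRING(input, i, 1);
    -- keep digits and letters only; add e.g. '-' to the class if you need it
    IF c REGEXP '^[0-9A-Za-z]$' THEN
      SET result = CONCAT(result, c);
    END IF;
    SET i = i + 1;
  END WHILE;
  RETURN result;
END$$
DELIMITER ;
```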

 

Cleaning noisy data: PostgreSQL mode and modal share aggregate functions

Calculating the mode and modal share in PostgreSQL has never been easier than just running these functions.

Some background:
I needed to calculate the aggregate mode, and the frequency of that mode, to draw conclusions about the stability of some particularly noisy data. Sample standard deviation can be very misleading as a measure of noise, since it mixes the amplitude of variation with its frequency.

Imagine measuring the best temperature to drink wine based on what people would report.

Some would say that for a particular wine the temperature should be 10 Celsius, while many others would say 11-12 degrees. As we are talking about user input, someone may jokingly recommend drinking it at -20 Celsius or at 200 Celsius. The data cannot be trusted.

So let’s take the array recommended_temperature = [8,10,10,12,11,10,10,12,10,11,-20,200,10,10].

Finding the right temperature to serve the wine at from this array can be tricky. Data is noisy and not to be trusted.

Count: 14

Mean (average): 21.71429

Standard deviation: 51.95624

Variance: 2699.45055

Do any of the above numbers even come close? No. You would not serve the wine at an average of 22 Celsius, and the standard deviation is high, but that doesn’t tell you whether the values are spread out or whether an outlier is messing everything up.

What about the mode? The mode in our case is 10 (the most common value).
Sounds reasonable: wine tastes nice at 10 degrees. But how sure are we that this is the right temperature?
We need more info, namely the modal share (number of values in the mode / total number of values).

For that, I put together a mode_count function in PostgreSQL, which lets you find how many values fall in the mode, and from that a share of all values.
So now we have mode = 10 and modal share = 0.5 (7/14).
We can say with 50% certainty that the appropriate temperature to serve the wine at, based on user recommendations, is 10. Add in the high standard deviation, and you will easily detect that strong outliers are present in the data, giving you even more confidence in your mode value.

The function for mode, from the Postgres manual:
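(The original listing was lost from this page; this is the widely circulated PostgreSQL example aggregate.)

```sql
CREATE OR REPLACE FUNCTION _final_mode(anyarray)
  RETURNS anyelement AS
$$
  SELECT a
  FROM unnest($1) a
  GROUP BY 1
  ORDER BY COUNT(1) DESC, 1
  LIMIT 1;
$$ LANGUAGE sql IMMUTABLE;

CREATE AGGREGATE mode(anyelement) (
  SFUNC = array_append,   -- accumulate values into an array
  STYPE = anyarray,
  FINALFUNC = _final_mode, -- then pick the most frequent one
  INITCOND = '{}'
);
```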

and the custom mode_count function:
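A sketch along the same lines, returning the frequency of the mode instead of the mode itself:

```sql
CREATE OR REPLACE FUNCTION _final_mode_count(anyarray)
  RETURNS bigint AS
$$
  SELECT COUNT(1)
  FROM unnest($1) a
  GROUP BY a
  ORDER BY COUNT(1) DESC
  LIMIT 1;
$$ LANGUAGE sql IMMUTABLE;

CREATE AGGREGATE mode_count(anyelement) (
  SFUNC = array_append,
  STYPE = anyarray,
  FINALFUNC = _final_mode_count,
  INITCOND = '{}'
);
```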

They are aggregate functions, so call them as:
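For example, on the wine temperatures above:

```sql
SELECT mode(t) AS mode,
       mode_count(t)::numeric / COUNT(*) AS modal_share
FROM unnest(ARRAY[8,10,10,12,11,10,10,12,10,11,-20,200,10,10]) AS t;
-- mode = 10, modal_share = 0.5
```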

Hope you enjoy!