Get yesterday’s date in a windows batch file – correct version

Hi all,

If you use a windows server and get yesterday’s date for something like a download, you might probably use this popular script that can be found here and there. Unfortunately, the script is somewhat broken when dealing with getting the last day of the month. The logic is solid, but somewhere between the idea and implementation, something was lost.

see below the working, correct version:

 

 

Normalizing codes, mySQL extracting alphanumeric functions

Wanted to share an sql function I end up using quite frequently when dealing with user input data – normalization to alphanumeric. It can also be modified into a function to extract numeric values. If you do modify it to that, make sure you keep returns varchar for the cases of extracting numeric values beginning with 0. You can also add other characters, such as the minus sign, if required.

Example use cases:

Matching producer codes:

Normalizing/matching  phone numbers:

 

function:

 

Cleaning noisy data: Postgresql Mode and Modal Share aggregate function

Calculating mode and modal share on postgresql has never been easier than just running these functions.

Some background:
I needed to calculate the aggregate mode and the frequency of that mode to be able to draw some conclusions about the stability of some particularly noisy data. Sample standard deviation can be very misleading for measuring data noise, as it’s a mix of amplitude of variation and frequency.

Imagine measuring the best temperature to drink wine based on what people would report.

Some would say for a particular wine the temperature would be 10 celsius, while many others will say 11-12 degrees.  As we are talking about user input, someone may jokingly recommend drinking it at -20 Celsius or at 120 Celsius. The data cannot be trusted.

So let’s take the array recommended_temperature [8,10,10,12,11,10,10,12,10,11,-20,200,10,10]

Finding the right temperature to serve the wine at from this array can be tricky. Data is noisy and not to be trusted.

Total Numbers: 14

Mean (Average): 21.71429

Standard deviation:51.95624

Variance(Standard deviation):2699.45055

Do any of the above numbers even come close? No. You would not serve the wine at an average of 22 Celsius, and the standard deviation is high, but that doesn’t tell you if the values are spread, or if an outlier is messing everything up.

What about the mode? Mode in our case is 10 (the most common value)
Sounds reasonable, wine tastes nice at 10 degrees. But how sure are we that this is the right temperature?
We need more info, mainly modal share (number of values in the mode/all values)

For that, I put together in PostgreSQL a mode_count function, to allow you to find how many values you have in the mode, and possibly calculate a % of all values.
so now we have Mode:10;Modal_share 0.5 (7/14)
We can say with 50% certainty that the appropriate temperature to serve the wine at, based on user recommendations, is 10. Add in the high standard deviation, and you will easily detect the fact that there are strong outliers present in the data, giving you even more confidence towards your mode value.

The function from postgres manual for mode: 

and the custom mode count function:

They are aggregate functions, so call them as:

Hope you enjoy!