When GUIDs Collide

What happens when 2 identifiers or keys collide?  Bad stuff.  Databases originally used integers for their record identifiers.  If you wanted to add a new record you must query or ask database what was the last integer and add one to it.

This is fine in isolation, but what happens if you want to insert a bunch of records into a remote database that you can only connect to once per hour?  Oh and there are multiple users.  You can’t use Integers anymore you must use something like a UUID/GUID.

Globals Unique Identifiers (GUID) or Universally Unique Identifiers provide a “random” 128bit value that looks like the following:

https://sessionlink.herokuapp.com/?id=c42db039-ba38-4ce8-a5c0-c997894bcade

They work great, and once you switch from Integers to GUIDs your database is truly distributed and can be updated on the moon!  They are a bit ugly though.  Also they take a huge amount of storage compared to the 64 bits of an integer.  Data size matters sometimes.  Wait, GUIDs are 128 bits!

But still we have a slight probability of generating or guessing the same ID with the birthday problem.

Screen Shot 2019-10-27 at 6.42.43 AM.png

If you are FAANG then you have to start worrying about stuff like this.  There will be collisions.  What happens if you add another GUID or two, increasing the bits to 512?

So there is still the GUID Soup to consider…  Start putting multiple GUIDs in an URL and it is pretty gross.

Enter CUID, which has some interesting objectives:

Collision-resistant ids optimized for horizontal scaling and binary search lookup performance.

Screen Shot 2019-10-27 at 6.29.57 AM.png

How do they do this?  with a string 25 characters.  Timestamping is critical to its binary search capabilities.  Ordering by timestamp allows the lookup to be logarithmic.  So if you need to find one record in a billion.  It would take about 30 tries to find it.  Sorting is key to speed here.

If you need speed and no creation time privacy CUIDs may be a viable option.  Anonymous systems must never use CUIDs.

Only slightly larger in storage size than a GUID… 25 Characters * 8 Bits/Character = 200 Bits.  Guessing CUIDs I feel would be easier…

Screen Shot 2019-10-27 at 6.50.42 AM.png

I’ll stick with my GUIDs soup for now.

Until Next Time. Happy Merging Distributed Datasets,

Rob

 

P.S. Microsoft trying to brand it, er recapitulate:

Screen Shot 2019-10-27 at 7.12.13 AM.png

Duck Typing

If it walks like a duck and talks like a duck it must be a duck.

Recently I started a project with Pure “Vanilla” Javascript, and it was quick going.  With the React as the starter pack, I had a basic prototype with proof of concept multi-user editing: https://twitter.com/robmurrer/status/1169118440122015744

That was a month ago and I ported it to Typescript which is a superset of Javascript (JS) that “compiles” to JS.  It plays well with React’s JSX becoming TSX.  Tip to cast in Typescript:

 blockchain = await this.state.store.getItem(id) as BlockProps;

To me, having to do the work of a compiler, is maddening and that is why I actually really love Typescript (TS).  Coming from any traditional language it is a nice safety cushion under you.

TS has excellent tooling to enable confident refactors, which is the number one issue with my JS codebase.  Refactoring “blind” in pure JS is nightmare.  Refactoring is by far the most important task, so I’ve gone full TS.  But I can see if you are just learning JS you should absolutely not start with TS.

Start pure Vanilla JS and build something. Then when you get stuck trying to refactor.  Revert and start fresh with TS.  I like the quick rewrite and throwaway approach for rapid prototyping.  JS is the best rapid prototyping environment I’ve ever seen.

Ok on to code… The most important part of an editor is the data the user enters.  You cannot lose it.  We are using the Local Forage package from the folks at Mozilla for the data storage layer.  Under the hood it will use IndexedDB on modern browsers.

Screen Shot 2019-10-21 at 6.59.49 AM.png

Ok so what are we looking at here?  The genesis for the post.  What the heck is Hydrate and Dehydrate?  It’s just the cool term for serialize and deserialize.  I like it. The whole point of these names is to communicate what it is they do.

Update:  Um it writes the data in a format it can send over wire or store in database.

But there is something about naming it what it has been called in the past.  Hence this post.

The tour de force of programming design patterns:

  • Finite State Machine (FSM)
    • React’s fundamental concept
    • Start here!
  • Command Pattern
    • Ahem…
  • Classes
    • Only to hold state!!!
    • Of course React uses inheritance 😀
  • Composite
    • Just a tree, or
    • Now with more Blockchain

I’ll write/link up some more about the above at some point.

Until next time, Happy Coding!

 

Screen Shot 2019-10-21 at 7.38.09 AM

Coded into a Corner

Creating software is easy, creating software that is sustainable to maintain and extend is hard.  There are not many mediums in which the consequences of a simple change or design decision can render the complete system useless.  The way in which the software is built is just as critical as to who is building it.

Honey why is the toilet flushing now when I flip the light switch?

Whoa! Cool! I didn’t realize changing the light bulb would do that.

Poorly designed systems are interconnected in ways that are not necessary.  It is often easier to write the prototype in this manner, but products often need to be refactored to use time tested design patterns.

If you ever find yourself scared to change something or can’t figure out how to add the next killer feature.  It is certainly time you reevaluate what you have and don’t be afraid to delete your code.   You may be surprised how much of it is no longer needed.

More code, more bugs!

Recommended Reading

61iOSX4WNiL._SX404_BO1,204,203,200_.jpg

 

The Weekend Thrash

So you think you are almost there. You think one or two more hacking sessions could “do it” and you start the weekend thrash.

What is the difference between a clever hack and an unmaintainable pit of tar?

If you are in the Zone, then you of course nothing can stop you.  But I say stop and rethink your life.  You probably are building the wrong thing anyway.

I don’t roll on Shabbos!

 

walter-sobchak.jpg

 

Blaming the User

My favorite super-pattern in developing and supporting software.  Always assume the user is wrong.  Remember, you are User Number Zero, so… blame yourself.

Screen Shot 2018-07-08 at 8.51.40 AM.png

Defensive Programming is a technique similar to Defensive Driving. Always assume the user will break your program. Defensive programming will save you time in the long run by reducing support and increasing user satisfaction.

Its aim is to reduce the risk of collision by anticipating dangerous situations, despite adverse conditions or the mistakes of others – Defensive Driving – Wikipedia 2018

We must have a balance of making programs flexible enough to allow the user to use it for intentions unforeseen by the tool-makers themselves, and to ensure they don’t fall off the rails and lose data because they entered a word instead of a number.

Input Validation

  • Never trust Input Text Boxes
    • Always convert numerical values using ParseFloat or equivalent
    • Never  Eval() or Shell() with input string
    • Never use string to build SQL Query without proper escaping. see SQL Injection
    • Always trim whitespace at beginning and end of string
  • Give clear and actionable error messages when providing feedback to user on their incorrect inputs
  • … This post could go on for a while, TBF

 

Further Reading

Shipping a Beta

So you have done a demo of your application and your users want to use it for themselves.  How ready is your project to move from barely working prototype to an actual product?  Demos are deceiving.  Remember you only take a picture from the pretty side.

To be able to ship the prototype to beta is a huge step and sometimes can take too long.  Every day that goes by that the product isn’t in a user’s hands is another day without oxygen.  You are swimming further and further from the line.  Toe the line and ship that beta no matter what.

Some Questions to Help You Get Started

  • How does the user install or get your software?
  • Do you have a quick start 1 page tutorial or guide on basic usage?
  • Does it work in your user’s environments?
  • Does it really solve the users issue?
  • How often do they use it?
  • How fast can you ship another beta to fill all the holes they are sure to run into eventually?

Demo or Die

The amount of information conveyed with a picture is said to be worth a thousand words. What is the information density of a demo?  Pictures, flow arrows, and hand waiving can only get you so far.  Demo or die, or sometimes… demo and die.

5b6af7b0-efd9-4d94-b7d7-3319270f2662.jpg

Don’t you love it when someone in audience gets smart and tries to move you off script… the movie magic ends real quick.

There is something about the medium in which you can poke and paw a slider or a button and have something update in realtime.  It either works or it crashes and burns.  A working demo is the baseline test of competency, but the only problem is can the demo make it to 1.0?  Shipping software is incredibly hard and the history of PowerPoint 1.0 is facinating.

In the early days they had trouble convincing people of what PowerPoint was and what problem it solved.  Their mockups were impeccable.  The amount of planning and discipline of these 1980s software startups is inspiring.  Sweating Bullets is on the must read list for anybody who studies software development.

Suggested Reading

PDF of Sweating Bullets by Robert Gaskins

Screen Shot 2018-05-15 at 9.13.21 PM.png