Parsing a nested value from a JSON string in .NET Core 3 without needing a DTO

ASP.NET Core no longer uses Newtonsoft.Json by default (since .NET Core 3.0). A lot of the examples online show how to use the new System.Text.Json namespace to extract values from JSON strings by deserializing them into full POCO classes (DTOs/ViewModels).

If you only want to extract the value of a particular property and don’t want the overhead of creating a lot of nested classes, it’s very simple to extract values directly using the JsonDocument class.

For example, suppose I have the following nested JSON structure (taken from the Google PageSpeed API, in case you’re curious):

[Image: PageSpeed nested JSON structure]

To extract the 0.82 score value, I can just parse the string into a JsonDocument and then chain a couple of GetProperty calls together.

[Image: How to read nested values in System.Text.Json]
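As a minimal sketch of the chained GetProperty approach: the sample JSON below is a hypothetical, heavily trimmed stand-in for a PageSpeed response, and the property path (lighthouseResult, categories, performance, score) along with the NestedJsonExample and ReadScore names are assumptions made for illustration.

```csharp
using System;
using System.Text.Json;

public static class NestedJsonExample
{
    // Hypothetical sample, shaped like a trimmed-down PageSpeed response.
    public const string SampleJson = @"{
      ""lighthouseResult"": {
        ""categories"": {
          ""performance"": { ""score"": 0.82 }
        }
      }
    }";

    public static double ReadScore(string json)
    {
        // JsonDocument is IDisposable, so dispose it once the value is read.
        using JsonDocument document = JsonDocument.Parse(json);

        // Each GetProperty call throws KeyNotFoundException if the name is missing.
        return document.RootElement
            .GetProperty("lighthouseResult")
            .GetProperty("categories")
            .GetProperty("performance")
            .GetProperty("score")
            .GetDouble();
    }
}
```

Calling NestedJsonExample.ReadScore(NestedJsonExample.SampleJson) reads the score without any DTO classes being defined.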

Of course, to use GetProperty you need to be sure the property will always exist in your JSON, as otherwise a KeyNotFoundException will be thrown. If you’re not sure about this, use TryGetProperty instead and check that you’ve successfully got an element before moving on…

[Image: TryGetProperty method]
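A defensive version might look something like the sketch below. The SafeNestedJsonExample class, its method names and the PageSpeed-style property path are assumptions for illustration. One detail worth knowing is that TryGetProperty itself throws an InvalidOperationException when called on an element that is not a JSON object, so this sketch checks ValueKind first.

```csharp
using System;
using System.Text.Json;

public static class SafeNestedJsonExample
{
    // Returns false instead of throwing when the parent is not an object
    // or when the named property is absent.
    private static bool TryGetChild(JsonElement parent, string name, out JsonElement child)
    {
        child = default;
        return parent.ValueKind == JsonValueKind.Object
            && parent.TryGetProperty(name, out child);
    }

    // Returns the score, or null if any step of the path is missing.
    public static double? TryReadScore(string json)
    {
        using JsonDocument document = JsonDocument.Parse(json);

        if (TryGetChild(document.RootElement, "lighthouseResult", out JsonElement lighthouseResult)
            && TryGetChild(lighthouseResult, "categories", out JsonElement categories)
            && TryGetChild(categories, "performance", out JsonElement performance)
            && TryGetChild(performance, "score", out JsonElement score)
            && score.ValueKind == JsonValueKind.Number)
        {
            return score.GetDouble();
        }

        return null;
    }
}
```

Returning a nullable double keeps the "not found" case explicit for the caller, which is a design choice of this sketch rather than anything System.Text.Json requires.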

Generating sequential GUIDs which sort correctly in SQL Server in .NET

Using a non-sequential GUID in either a clustered or non-clustered index is not ideal from a performance point of view. Non-sequential GUIDs are not ever-increasing (unlike an identity int), so they get inserted into the middle of the index rather than at the end, resulting in increased logical fragmentation and decreased performance.

If you really can’t switch from a GUID to an identity int (to save space and fragmentation) and you must use a GUID in an index, mitigate the performance problems by ensuring it is sequential and ever-increasing. SQL Server allows you to do this by setting a default constraint on uniqueidentifier columns which calls the NEWSEQUENTIALID() function. This function creates a GUID that is greater than any GUID previously generated by it on a specified computer since Windows was started. The problem with this approach, however, is that the GUID is created on the DB side, so it won’t be useful if you need to pass in a GUID from the client side.

There are a number of options for generating sequential GUIDs in C# which are compatible with SQL Server’s GUID/UUID sorting mechanism. I’ll cover three of them in this post.

Options for generating SQL Server compatible sequential GUIDs

  1. SequentialGuidValueGenerator, which is part of Entity Framework Core.
    Note that at the time of writing the documentation for this is incorrect. It does not generate sequential GUIDs using the same algorithm as NEWSEQUENTIALID(); however, it does generate sequential GUIDs which are sortable with respect to SQL Server’s GUID sorting approach. Looking at the source of the method, we can see that it calls the regular Guid.NewGuid() and performs byte shuffling to optimise the result for SQL Server. Usage is simple and, of course, you don’t need to actually be using EF to reference the assembly and use the function. The generated GUIDs sort correctly in SQL Server.
    [Image: Generating sequential GUIDs in .NET]
  2. UuidCreateSequential with byte shuffling applied.
    NEWSEQUENTIALID() is a wrapper over the Windows UuidCreateSequential function with some byte shuffling applied. We can therefore call this function by importing the relevant DLL and rearranging the bytes in the same way as SQL Server does to get sequential GUIDs in C#. The link above from StackOverflow has all the implementation details. Looking at 20 GUIDs created with this approach, we can see they are much more uniform than those from the EF sequential GUID algorithm covered above.
    Depending on your scenario, a number of problems with this approach may limit its feasibility:
    1 – Requires DllImport, which may not be allowed/desired and might have permissions issues.
    2 – Not cross-platform. This is a Windows DLL, so if you’re deploying your .NET Core app to Linux this isn’t a runner.
    3 – If your Windows server restarts, your GUIDs may start from a lower range, thus causing index fragmentation.
    4 – Can’t be used in a cluster environment where multiple machines write to the same DB, as the GUIDs generated will all be out of sync with each other, thus causing index fragmentation.
  3. COMB GUID.
    The COMB GUID approach was first described by  and involves replacing the portion of a normal GUID that is sorted first with a date/time value. This guarantees (within the precision of the system clock) that values will be sequential, even when the code runs on different machines. There are lots of examples of COMB GUIDs in C# online, but I like RT.Comb, which is a library available on NuGet. Below you can see how it’s used in its basic configuration. And we can see the GUIDs it creates are sorted correctly in SQL Server…

but wait… actually they are not sorted correctly. This is because the timestamp is generated from DateTime.UtcNow and stored with a precision of only 1/300th of a second (about 3.3 ms), meaning that if you quickly generate a lot of COMB GUIDs like above there is a chance you’ll create two with the same timestamp value. This won’t result in collisions, as the non-timestamp bits take care of that, but it means COMBs with the same timestamp are not guaranteed to be sorted correctly for insertion into the DB.

Therefore if you’re inserting records faster than the precision offered by DateTime.UtcNow (very possible) these inserts will cause index fragmentation. Avoiding index fragmentation is the whole purpose of using sequential GUIDs in the first place so we need a solution to this if COMBs are to be viable compared to other approaches.
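To make the COMB idea concrete, here is a minimal, self-contained sketch. It is not RT.Comb’s implementation: the CombSketch class, its method names and the choice of a 48-bit Unix-millisecond timestamp are all assumptions made for illustration. It relies on SqlGuid from System.Data.SqlTypes, whose comparison follows SQL Server’s uniqueidentifier ordering, in which the last six bytes are the most significant.

```csharp
using System;
using System.Data.SqlTypes;

public static class CombSketch
{
    // Builds a COMB-style GUID: random bytes, with the six bytes SQL Server
    // compares first (indexes 10..15) overwritten by a big-endian timestamp.
    public static Guid Create(DateTimeOffset timestamp)
    {
        byte[] bytes = Guid.NewGuid().ToByteArray();
        long unixMs = timestamp.ToUnixTimeMilliseconds();

        for (int i = 0; i < 6; i++)
        {
            int shift = 8 * (5 - i); // most significant byte goes first
            bytes[10 + i] = (byte)((unixMs >> shift) & 0xFF);
        }

        return new Guid(bytes);
    }

    // True when SQL Server would sort 'first' before 'second'.
    public static bool SortsBeforeInSqlServer(Guid first, Guid second)
    {
        return new SqlGuid(first).CompareTo(new SqlGuid(second)) < 0;
    }
}
```

Two GUIDs created from different timestamps sort in timestamp order. Two created from the same timestamp share their last six bytes, so their relative order falls back to the random bytes, which is exactly the problem described above.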

Thankfully a solution is provided by RT.Comb itself and is part of the reason I like to use this library rather than just some of the code for creating COMBs available on StackOverflow or other places online. RT.Comb has a timestamp provider called UtcNoRepeatTimestampProvider which ensures that the current timestamp is at least Xms (4ms is the default, but this can be changed) greater than the previous one and increments the current one by Xms if it is not. The library documentation gives the following table of what timestamp UtcNoRepeatTimestampProvider (RHS) will provide compared to the original DateTime.UtcNow (LHS) timestamp.

DateTime.UtcNow    UtcNoRepeatTimestampProvider
02:08:50.613       02:08:50.613
02:08:50.613       02:08:50.617
02:08:50.613       02:08:50.621
02:08:50.617       02:08:50.625
02:08:50.617       02:08:50.629
02:08:50.617       02:08:50.632

Using UtcNoRepeatTimestampProvider is simple and results in GUIDs which are correctly sorted in SQL Server as shown below.

[Image: Creating sequential GUIDs in .NET]
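The no-repeat idea itself is easy to sketch in isolation. The NoRepeatClock class below is hypothetical and is not RT.Comb’s code. It only illustrates the rule described above: if the incoming timestamp is not at least the configured increment later than the previous one, hand back the previous one plus the increment instead. It adds a fixed increment, so it won’t necessarily reproduce every value in the library’s table.

```csharp
using System;

public sealed class NoRepeatClock
{
    private readonly object gate = new object();
    private readonly TimeSpan increment;
    private DateTime last = DateTime.MinValue;

    // 4 ms mirrors the default mentioned in the text; pass another value to change it.
    public NoRepeatClock(double incrementMs = 4)
    {
        increment = TimeSpan.FromMilliseconds(incrementMs);
    }

    // Returns 'now' if it is far enough past the previous result,
    // otherwise the previous result plus the increment.
    public DateTime Next(DateTime now)
    {
        lock (gate) // keep the check-and-update atomic across threads
        {
            DateTime earliestAllowed = last + increment;
            DateTime result = now < earliestAllowed ? earliestAllowed : now;
            last = result;
            return result;
        }
    }
}
```

In real use the argument would be DateTime.UtcNow. Taking it as a parameter just makes the behaviour easy to check with fixed values.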

Which Sequential GUID approach to use?

Of course there are many other solutions on the web which you can choose. All three discussed above are super fast. On my machine I did 100K iterations in at most 101 ms, so this wouldn’t influence me either way. Given the potential problems outlined with the DLL import approach above, I’d avoid it unless there were no viable alternatives, which there are.

There’s not much to choose between the other two. If you’re using EF Core to update your DB then you’ll have SequentialGuidValueGenerator available anyhow, so I’d just go with what you have out of the box rather than integrating RT.Comb, for example. COMBs are ideal if you want to be able to extract the timestamp out of the GUID for things like debugging, but that perhaps has limited real-world benefit. There are many COMB libraries available online, but if you happen to use both SQL Server and PostgreSQL, RT.Comb is ideal as it supports both.

Note: sequential GUIDs are by their nature guessable, so don’t use them in a security-sensitive context.

Converting numbers to strings without scientific notation in C#

By default, C# converts float and double values with a very small (or very large) magnitude to scientific notation when turning them into strings; decimal values are not affected. This means that if you have a double which, for example, contains the value .00009 and attempt to convert it to a string, C# will display it as 9E-05. Of course this may not always be desired. To ‘fix’ this you just need to explicitly format the string:

double number = .00009;
string defaultNumber = number.ToString(); //9E-05
string numberFromToString = number.ToString("N5"); //0.00009
string numberFromStringFormat = string.Format("{0:F5}", number); //0.00009

Change 5 above to whatever level of precision you require.
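Two caveats are worth illustrating. The N specifier also inserts group separators, and both N and F use the current culture’s decimal separator unless told otherwise. The sketch below (the NumberFormattingSketch class and its method names are hypothetical) passes CultureInfo.InvariantCulture explicitly and also shows a custom format string, which avoids scientific notation without padding the result with trailing zeros.

```csharp
using System;
using System.Globalization;

public static class NumberFormattingSketch
{
    // Fixed-point with exactly 'decimals' digits and no group separators.
    public static string ToFixed(double value, int decimals)
    {
        return value.ToString("F" + decimals, CultureInfo.InvariantCulture);
    }

    // Same digits, but with group separators, e.g. a comma per thousand.
    public static string ToGrouped(double value, int decimals)
    {
        return value.ToString("N" + decimals, CultureInfo.InvariantCulture);
    }

    // Up to ten decimal places, with trailing zeros dropped.
    public static string ToPlain(double value)
    {
        return value.ToString("0.##########", CultureInfo.InvariantCulture);
    }
}
```

For 1234.5 with two decimals, ToFixed gives 1234.50 while ToGrouped gives 1,234.50, which matters if the string is later parsed or written to a CSV file.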

Discouraging use of the var keyword and ternary if operator

I would always favour typing more code to make it more explicit, more readable and to ensure consistency in style throughout a software system. Minimising the bytes and lines needed to do something shouldn’t take precedence over readability. My two pet hates in this regard are the var keyword and the ternary (?) if operator.

I know var is just syntactic sugar and everything is still type safe, but for me it just moves C# in the direction of a non-type-safe language, at least in regard to syntax style, and personally I just don’t like using it. I spoke to another developer about it recently and he was very dogmatic that it is a good thing as it’s shorter and more concise. I agree that in some instances that can certainly be the case, but because it’s not appropriate for all declarations such as:

var myVariable = System.IO.File.Open("test.txt", FileMode.Create);

or

var id = GetId();

it means a developer will either a) use var everywhere including in statements like the above where the type is in fact not obvious or b) use explicit declarations for statements like above and use var declarations for statements such as:

var names = new List<string>();

which means you either have many variable declarations which are hard to understand or an inconsistent coding style. If var is used at all, another developer will no doubt come along and use it inappropriately, so I prefer to discourage its use.

As far as ternary operator (?) ifs are concerned, again I prefer not to use them. I’d rather just use a standard multi-line if through the whole system, this way everything is explicit and the judgement call of whether the use of ? actually makes a particular if statement easier to understand or not is eliminated. I mean for simple expressions they can be neat but the problem is that in a team environment the precedent set by using them at all results in their overuse by less skilled developers. For example it definitely wouldn’t surprise me to see statements like the below:

int a = b > 10 ? c < 20 ? 50 : 80 : e == 2 ? 4 : 8;

pop up in a code base which has instances of ? already for simple expressions. Again then for reasons related to removing ambiguity about the appropriateness or not of its use, I discourage writing if statements with the ternary operator.
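For comparison, here is the same logic written with plain if statements. The ExplicitBranching class and its method names are hypothetical, and the ternary version is kept only so the two can be checked against each other. Note that reading the one-liner correctly already requires knowing that the operator is right-associative.

```csharp
public static class ExplicitBranching
{
    // The nested ternary from the text, unchanged.
    public static int WithTernary(int b, int c, int e)
    {
        return b > 10 ? c < 20 ? 50 : 80 : e == 2 ? 4 : 8;
    }

    // The same decision tree, one branch per line.
    public static int WithIfElse(int b, int c, int e)
    {
        int a;

        if (b > 10)
        {
            if (c < 20)
            {
                a = 50;
            }
            else
            {
                a = 80;
            }
        }
        else
        {
            if (e == 2)
            {
                a = 4;
            }
            else
            {
                a = 8;
            }
        }

        return a;
    }
}
```

It is longer, but each outcome can be read (and given a breakpoint) on its own line.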

Code is read much more than it’s written, so don’t save a couple of seconds using C# shorthand when writing it if it’s possible this will slow down those maintaining it.