Project.json comparison with XML equivalent
{
"authors": [
"Sam Saffron",
"Marc Gravell",
"Nick Craver"
],
"owners": [
"marc.gravell",
"nick.craver"
],
"projectUrl": "https://github.com/StackExchange/dapper-dot-net",
"licenseUrl": "http://www.apache.org/licenses/LICENSE-2.0",
"summary": "A high performance Micro-ORM",
"description": "A high performance Micro-ORM supporting SQL Server, MySQL, Sqlite, SqlCE, Firebird etc..",
"version": "1.50-beta9",
"title": "Dapper dot net",
"tags": [
"orm",
"sql",
"micro-orm"
],
"copyright": "2015 Stack Exchange, Inc.",
"dependencies": {},
"releaseNotes": "http://stackexchange.github.io/dapper-dot-net/",
"compilationOptions": {
"warningsAsErrors": true
},
"frameworks": {
"net40": {
"frameworkAssemblies": {
"System.Data": "4.0.0.0",
"System.Xml": "4.0.0.0",
"System.Xml.Linq": "4.0.0.0"
}
},
"net45": {
"compilationOptions": {
"define": [
"ASYNC"
]
},
"frameworkAssemblies": {
"System.Data": "4.0.0.0",
"System.Xml": "4.0.0.0",
"System.Xml.Linq": "4.0.0.0"
}
},
"net451": {
"frameworkAssemblies": {
"System.Data": "4.0.0.0",
"System.Xml": "4.0.0.0",
"System.Xml.Linq": "4.0.0.0"
},
"compilationOptions": {
"define": [
"ASYNC"
]
}
},
"netstandard1.3": {
"compilationOptions": {
"define": [
"ASYNC",
"COREFX"
]
},
"dependencies": {
"NETStandard.Library": "1.5.0-rc2-24015",
"System.Data.SqlClient": "4.1.0-rc2-24015",
"System.Dynamic.Runtime": "4.0.11-rc2-24015",
"System.Reflection.Emit": "4.0.1-rc2-24015",
"System.Reflection.Emit.Lightweight": "4.0.1-rc2-24015",
"System.Xml.XmlDocument": "4.0.1-rc2-24015",
"System.Collections.NonGeneric": "4.0.1-rc2-24015",
"System.Reflection.TypeExtensions": "4.1.0-rc2-24015"
}
}
}
}
<project
authors="Sam Saffron, Marc Gravell, Nick Craver"
owners="marc.gravell, nick.craver"
projectUrl="https://github.com/StackExchange/dapper-dot-net"
licenseUrl="http://www.apache.org/licenses/LICENSE-2.0"
summary="A high performance Micro-ORM"
description="A high performance Micro-ORM supporting SQL Server, MySQL, Sqlite, SqlCE, Firebird etc.."
version= "1.50-beta9"
title= "Dapper dot net"
tags= "orm,sql,micro-orm"
copyright= "2015 Stack Exchange, Inc."
>
<releaseNotes>http://stackexchange.github.io/dapper-dot-net/</releaseNotes>
<compilationOptions warningsAsErrors="true"/>
<framework moniker="net40">
<assembly name="System.Data" version="4.0.0.0"/>
<assembly name="System.Xml" version="4.0.0.0"/>
<assembly name="System.Xml.Linq" version="4.0.0.0"/>
</framework>
<framework moniker="net45">
<compilationOptions define="ASYNC"/>
<assembly name="System.Data" version="4.0.0.0"/>
<assembly name="System.Xml" version="4.0.0.0"/>
<assembly name="System.Xml.Linq" version="4.0.0.0"/>
</framework>
<framework moniker="net451">
<compilationOptions define="ASYNC"/>
<assembly name="System.Data" version="4.0.0.0"/>
<assembly name="System.Xml" version="4.0.0.0"/>
<assembly name="System.Xml.Linq" version="4.0.0.0"/>
</framework>
<framework moniker="netstandard1.3">
<compilationOptions define="ASYNC,COREFX"/>
<dependency name="NETStandard.Library" version="1.5.0-rc2-24015"/>
<dependency name="System.Data.SqlClient" version="4.1.0-rc2-24015"/>
<dependency name="System.Dynamic.Runtime" version="4.0.11-rc2-24015"/>
<dependency name="System.Reflection.Emit" version="4.0.1-rc2-24015"/>
<dependency name="System.Reflection.Emit.Lightweight" version="4.0.1-rc2-24015"/>
<dependency name="System.Xml.XmlDocument" version="4.0.1-rc2-24015"/>
<dependency name="System.Collections.NonGeneric" version="4.0.1-rc2-24015"/>
<dependency name="System.Reflection.TypeExtensions" version="4.1.0-rc2-24015"/>
</framework>
</project>
Owner

darrelmiller commented May 11, 2016

The JSON version is 51 bytes smaller (about 2%), primarily because the XML repeats the element names "assembly" and "dependency". Additional bytes could be saved at the cost of readability, but any kind of compression is going to make the differences negligible. However, using the default formatting for JSON documents, the XML document is 30 lines shorter. That's 38% shorter.

I had to take a liberty with the simple value arrays of authors and owners. XML doesn't represent these kinds of arrays in a compact way. However, .NET is quite adept at converting a comma-delimited list into an array, so using a comma-delimited list to represent an array is not much different from using DateTime.Parse() to pull a date out of a JSON file. I am making the assumption that the comma is not a valid character in the array values. For this particular case, it seems like a reasonable assumption.
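To make the comma-delimited approach concrete, here is a minimal sketch in Python rather than .NET. It uses the standard library's ElementTree on a trimmed copy of the `<project>` element; `split_list` is a hypothetical helper name, not part of any project tooling.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the <project> element from the gist.
PROJECT_XML = '''<project
    authors="Sam Saffron, Marc Gravell, Nick Craver"
    owners="marc.gravell, nick.craver"
    tags="orm,sql,micro-orm"/>'''


def split_list(value):
    """Split a comma-delimited attribute value into a list, trimming whitespace."""
    return [item.strip() for item in value.split(",") if item.strip()]


project = ET.fromstring(PROJECT_XML)
authors = split_list(project.get("authors", ""))
owners = split_list(project.get("owners", ""))
tags = split_list(project.get("tags", ""))

print(authors)
print(owners)
print(tags)
```

The same helper would split a value such as "2015 Stack Exchange, Inc." into two items, which is exactly why the approach depends on commas never appearing inside a list value.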

The project.json file makes heavy use of JSON property names as data values, which is something that doesn't translate very cleanly into XML. Using XML element names to represent data values is a nasty thing to do. Instead, I created an element to represent each value in the hash. The advantage of being forced to do this is that it becomes really easy to add one or more optional attributes to the "assembly" or "dependency" elements. In the JSON document, if more information needs to be stored, there has to be a breaking change where the string value is changed to an object. And then it needs to be decided whether that should be done uniformly or only as required.
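The string-versus-object problem can be sketched as follows. This is an illustration only: `read_dependencies` is a hypothetical helper, and the extra `"type": "build"` field is an assumed example of "more information", not something taken from the file above.

```python
import json


def read_dependencies(deps):
    """Normalize a dependency map whose values are either a bare version
    string or an object that carries a "version" key plus extra fields."""
    normalized = {}
    for name, value in deps.items():
        if isinstance(value, str):
            # Short form: "Package.Name": "1.2.3"
            normalized[name] = {"version": value}
        else:
            # Long form: "Package.Name": {"version": "1.2.3", ...}
            normalized[name] = dict(value)
    return normalized


# One dependency in each form; the "type" field is hypothetical.
doc = json.loads('''{
  "System.Data.SqlClient": "4.1.0-rc2-24015",
  "System.Xml.XmlDocument": {"version": "4.0.1-rc2-24015", "type": "build"}
}''')

deps = read_dependencies(doc)
for name, info in deps.items():
    print(name, info)
```

Every consumer of the JSON has to carry a branch like this once the two forms coexist, whereas an XML reader that ignores unknown attributes keeps working when a new attribute is added to `<dependency>`.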

I am NOT trying to make the case that XML is a better format for this use-case than JSON. I am pointing out that there are pros and cons to both formats and I believe it is unreasonable to automatically assume that JSON is smaller, more readable, or more editable.

Owner

darrelmiller commented May 11, 2016

Oops, made a mistake in the XML by leaving the duplicate releasenotes attribute. The XML is now 15 bytes smaller than the JSON ;-)
And another advantage of the XML releaseNotes element is that I can write multi-line release notes without having to encode the newlines.
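A small Python sketch of the newline difference, using only the standard library. The two-line note text is a made-up placeholder, not real Dapper release notes.

```python
import json
import xml.etree.ElementTree as ET

# Placeholder text, invented for this illustration.
notes = "First line of notes\nSecond line of notes"

# JSON strings cannot contain a raw newline, so it is escaped as \n
# and the whole value stays on one physical line.
json_text = json.dumps({"releaseNotes": notes})

# XML element text may contain a literal newline.
element = ET.Element("releaseNotes")
element.text = notes
xml_text = ET.tostring(element, encoding="unicode")

print(json_text)
print(xml_text)

# Both formats round-trip to the same string.
assert json.loads(json_text)["releaseNotes"] == notes
assert ET.fromstring(xml_text).text == notes
```

This holds for element text; newlines inside an XML attribute value are normalized by parsers, so the same trick would not work if releaseNotes were an attribute.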

attilah commented May 11, 2016

While in JSON every array/value is machine readable, in this XML version a few lists are forced into "multi-value" comma-separated XML attributes, which is not the best IMHO. That can be one drawback. Putting every list-type value into its own XML element could be a better way.

Owner

darrelmiller commented May 11, 2016

@attilah I agree, the comma separated value is not ideal. However, doing a String.Split() to create the array is not very difficult for the parsing code to do. For people searching for values using JSON tooling, doing a substring search would be sufficient to be able to identify a particular author/owner.
One alternative would be to simply allow elements within the root, e.g.

<author name="Sam Saffron"/>
<author name="Marc Gravell" owner="true" username="marc.gravell"/>
<author name="Nick Craver" owner="true" username="nick.craver"/>

This has the added benefit of being able to add additional attributes at a later point.
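As a rough sketch of how a consumer might read that alternative, the Python below parses the three author elements (with their quotes balanced and wrapped in a bare `<project>` root so the fragment is well-formed) and derives both lists from them. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The <author> elements suggested above, inside a minimal root element.
AUTHORS_XML = '''<project>
  <author name="Sam Saffron"/>
  <author name="Marc Gravell" owner="true" username="marc.gravell"/>
  <author name="Nick Craver" owner="true" username="nick.craver"/>
</project>'''

root = ET.fromstring(AUTHORS_XML)

# Every <author> contributes to the authors list.
authors = [a.get("name") for a in root.findall("author")]

# Only authors flagged owner="true" contribute to the owners list.
owners = [
    a.get("username")
    for a in root.findall("author")
    if a.get("owner") == "true"
]

print(authors)
print(owners)
```

No string splitting is involved, and a name containing a comma would be handled without any special casing.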

Here are two example project.yaml files, one created by an online JSON -> YAML converter and the other hand-authored to be as small as possible (without impacting readability). The converted one is ~3% smaller than the JSON and ~13% smaller than the XML, while the hand-written one is ~7% smaller than the JSON and ~17% smaller than the XML. I'm not sure if this is useful, but I see a number of structured document tools switching to, or at least allowing, YAML these days, so I thought I'd bring it up. Not only that, but from a consumption perspective, nearly every language has a YAML parser these days.

Online Converted (overly cautious wrapping of strings)

---
  authors: 
    - "Sam Saffron"
    - "Marc Gravell"
    - "Nick Craver"
  owners: 
    - "marc.gravell"
    - "nick.craver"
  projectUrl: "https://github.com/StackExchange/dapper-dot-net"
  licenseUrl: "http://www.apache.org/licenses/LICENSE-2.0"
  summary: "A high performance Micro-ORM"
  description: "A high performance Micro-ORM supporting SQL Server, MySQL, Sqlite, SqlCE, Firebird etc.."
  version: "1.50-beta9"
  title: "Dapper dot net"
  tags: 
    - "orm"
    - "sql"
    - "micro-orm"
  copyright: "2015 Stack Exchange, Inc."
  dependencies: {}
  releaseNotes: "http://stackexchange.github.io/dapper-dot-net/"
  compilationOptions: 
    warningsAsErrors: true
  frameworks: 
    net40: 
      frameworkAssemblies: 
        System.Data: "4.0.0.0"
        System.Xml: "4.0.0.0"
        System.Xml.Linq: "4.0.0.0"
    net45: 
      compilationOptions: 
        define: 
          - "ASYNC"
      frameworkAssemblies: 
        System.Data: "4.0.0.0"
        System.Xml: "4.0.0.0"
        System.Xml.Linq: "4.0.0.0"
    net451: 
      frameworkAssemblies: 
        System.Data: "4.0.0.0"
        System.Xml: "4.0.0.0"
        System.Xml.Linq: "4.0.0.0"
      compilationOptions: 
        define: 
          - "ASYNC"
    netstandard1.3: 
      compilationOptions: 
        define: 
          - "ASYNC"
          - "COREFX"
      dependencies: 
        NETStandard.Library: "1.5.0-rc2-24015"
        System.Data.SqlClient: "4.1.0-rc2-24015"
        System.Dynamic.Runtime: "4.0.11-rc2-24015"
        System.Reflection.Emit: "4.0.1-rc2-24015"
        System.Reflection.Emit.Lightweight: "4.0.1-rc2-24015"
        System.Xml.XmlDocument: "4.0.1-rc2-24015"
        System.Collections.NonGeneric: "4.0.1-rc2-24015"
        System.Reflection.TypeExtensions: "4.1.0-rc2-24015"

Hand Authored (removed unnecessary string wrapping)

---
  authors: 
    - Sam Saffron
    - Marc Gravell
    - Nick Craver
  owners: 
    - marc.gravell
    - nick.craver
  projectUrl: https://github.com/StackExchange/dapper-dot-net
  licenseUrl: http://www.apache.org/licenses/LICENSE-2.0
  summary: A high performance Micro-ORM
  description: A high performance Micro-ORM supporting SQL Server, MySQL, Sqlite, SqlCE, Firebird etc..
  version: 1.50-beta9
  title: Dapper dot net
  tags: 
    - orm
    - sql
    - micro-orm
  copyright: 2015 Stack Exchange, Inc.
  dependencies: {}
  releaseNotes: http://stackexchange.github.io/dapper-dot-net/
  compilationOptions: 
    warningsAsErrors: true
  frameworks: 
    net40: 
      frameworkAssemblies: 
        System.Data: 4.0.0.0
        System.Xml: 4.0.0.0
        System.Xml.Linq: 4.0.0.0
    net45: 
      compilationOptions: 
        define: 
          - ASYNC
      frameworkAssemblies: 
        System.Data: 4.0.0.0
        System.Xml: 4.0.0.0
        System.Xml.Linq: 4.0.0.0
    net451: 
      frameworkAssemblies: 
        System.Data: 4.0.0.0
        System.Xml: 4.0.0.0
        System.Xml.Linq: 4.0.0.0
      compilationOptions: 
        define: 
          - ASYNC
    netstandard1.3: 
      compilationOptions: 
        define: 
          - ASYNC
          - COREFX
      dependencies: 
        NETStandard.Library: 1.5.0-rc2-24015
        System.Data.SqlClient: 4.1.0-rc2-24015
        System.Dynamic.Runtime: 4.0.11-rc2-24015
        System.Reflection.Emit: 4.0.1-rc2-24015
        System.Reflection.Emit.Lightweight: 4.0.1-rc2-24015
        System.Xml.XmlDocument: 4.0.1-rc2-24015
        System.Collections.NonGeneric: 4.0.1-rc2-24015
        System.Reflection.TypeExtensions: 4.1.0-rc2-24015

I love the yaml version

The JSON formatting here is not great. That's one of the reasons why you see a big difference in line count. See this: https://gist.github.com/tugberkugurlu/2ea39144ac9a4ec1c53153340704744a

I don't think either JSON or XML is the optimal config file format. YAML is my preference. At the end of the day, this boils down to preference, but I don't think many ecosystems use XML for config. That should give a clue about its place.

Great one @whitlockjc! YAML version is so much much better

A TOML version, which can be as terse and readable as the YAML version but doesn't have significant whitespace:

authors = ["Sam Saffron", "Marc Gravell", "Nick Craver"]
owners = ["marc.gravell", "nick.craver"]

projectUrl = "https://github.com/StackExchange/dapper-dot-net"
licenseUrl = "http://www.apache.org/licenses/LICENSE-2.0"
summary = "A high performance Micro-ORM"
description = "A high performance Micro-ORM supporting SQL Server, MySQL, Sqlite, SqlCE, Firebird etc.."
version = "1.50-beta9"
title = "Dapper dot net"
tags = ["orm", "sql", "micro-orm"]
copyright = "2015 Stack Exchange, Inc."
releaseNotes = "http://stackexchange.github.io/dapper-dot-net/"

[compilationOptions]
  warningsAsErrors = true

[dependencies]

[frameworks]
  [frameworks.net40]
    [frameworks.net40.frameworkAssemblies]
      "System.Data" = "4.0.0.0"
      "System.Xml" = "4.0.0.0"
      "System.Xml.Linq" = "4.0.0.0"

  [frameworks.net45]
    [frameworks.net45.compilationOptions]
      define = ["ASYNC"]

    [frameworks.net45.frameworkAssemblies]
      "System.Data" = "4.0.0.0"
      "System.Xml" = "4.0.0.0"
      "System.Xml.Linq" = "4.0.0.0"

  [frameworks.net451]
    [frameworks.net451.compilationOptions]
      define = ["ASYNC"]

    [frameworks.net451.frameworkAssemblies]
      "System.Data" = "4.0.0.0"
      "System.Xml" = "4.0.0.0"
      "System.Xml.Linq" = "4.0.0.0"

  [frameworks."netstandard1.3"]
    [frameworks."netstandard1.3".compilationOptions]
      define = ["ASYNC", "COREFX"]

    [frameworks."netstandard1.3".dependencies]    
      "NETStandard.Library" = "1.5.0-rc2-24015"
      "System.Collections.NonGeneric" = "4.0.1-rc2-24015"
      "System.Data.SqlClient" = "4.1.0-rc2-24015"
      "System.Dynamic.Runtime" = "4.0.11-rc2-24015"
      "System.Reflection.Emit" = "4.0.1-rc2-24015"
      "System.Reflection.Emit.Lightweight" = "4.0.1-rc2-24015"
      "System.Reflection.TypeExtensions" = "4.1.0-rc2-24015"
      "System.Xml.XmlDocument" = "4.0.1-rc2-24015"

If hand editing is the aim, I've always said YAML or TOML beat XML or JSON. JSON is the worst of all.

nbxx commented May 12, 2016

I like .ini file, just kidding :-)

<project
...
  summary="A high performance Micro-ORM"
    >
  <releaseNotes>http://stackexchange.github.io/dapper-dot-net/</releaseNotes>

Summary is an attribute, releaseNotes is an element. These inconsistencies are bad, and they come with the choice of XML. And then there is too much clutter (e.g. name= in the example), which makes the XML the worst of all.

The XML is now 15 bytes smaller than the JSON ;-)

Consistency and no noise is IMO more important.

This is a great discussion with great examples! While format is important, what is more important to me is a strong backing model. I would rather see the project "schema" be a Project POCO with a well-known, well-defined class structure on which the file is based.

FWIW, I have created an issue around this here:
dotnet/roslyn-project-system#37

And also there is an open discussion in MSBuild around formats here and project improvements here:
Microsoft/msbuild#613
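The "backing model" idea can be sketched in Python with dataclasses standing in for a POCO. The class and field names below are assumptions chosen for illustration; they are not a real or proposed .NET project schema. The point is only that once the model exists, the file format becomes a loader detail.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Framework:
    """Per-framework settings (hypothetical model)."""
    defines: list = field(default_factory=list)
    framework_assemblies: dict = field(default_factory=dict)
    dependencies: dict = field(default_factory=dict)


@dataclass
class Project:
    """Top-level project model (hypothetical, heavily trimmed)."""
    title: str
    version: str
    authors: list = field(default_factory=list)
    frameworks: dict = field(default_factory=dict)


def project_from_json(text):
    """Build the model from project.json-style text."""
    raw = json.loads(text)
    frameworks = {
        moniker: Framework(
            defines=body.get("compilationOptions", {}).get("define", []),
            framework_assemblies=body.get("frameworkAssemblies", {}),
            dependencies=body.get("dependencies", {}),
        )
        for moniker, body in raw.get("frameworks", {}).items()
    }
    return Project(
        title=raw["title"],
        version=raw["version"],
        authors=raw.get("authors", []),
        frameworks=frameworks,
    )


project = project_from_json('''{
  "title": "Dapper dot net",
  "version": "1.50-beta9",
  "authors": ["Sam Saffron", "Marc Gravell", "Nick Craver"],
  "frameworks": {
    "net45": {
      "compilationOptions": {"define": ["ASYNC"]},
      "frameworkAssemblies": {"System.Data": "4.0.0.0"}
    }
  }
}''')
print(project.title, project.version)
print(project.frameworks["net45"].defines)
```

An XML, YAML, or TOML loader would populate the same two classes, so the choice of format would not leak into the code that consumes the project.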

I think XML is not bad. Parsing and tooling (for intellisense and stuff) should be easier to implement.

While XML is more verbose it has the advantage of being more descriptive. XML is also a better archival format. JSON is intended to be lightweight and ephemeral such as when passing small bits of data on a web service. If JSON is really a better format, then not only is Microsoft doing it wrong with .csproj and XAML but Apple is doing it wrong with Interface Builder and Google is doing it wrong with layout files. The W3C is even doing it wrong with HTML. At some point, archived files will need to be read by a human and XML is better suited for that. Just my two cents.

@whitlockjc, that hand-edited YAML example almost brings me to tears of joy. If that was what project files in Visual Studio looked like, and the .sln file just disappeared (or turned into YAML as well), I'm sure my life would be lengthened by a few years. I at least know that currently, the horror of .sln, .csproj and MSBuild as a whole is shortening my life expectancy by a lot due to heightened stress levels. I'm gnashing my teeth just writing that last sentence.
