Project
On a Solid data pod, you store data that you own and control. This means that, unlike a traditional app, a Solid app no longer needs to host (or protect) the data itself.
Personal data control does give rise to a new issue: What about data that we don’t want the user to be able to change?
Or rather: in an environment that is meant to be changeable, how can we prove to others that certain data is correct?
This problem is practically the defining use case for “verifiable credentials” (VCs). VCs are currently not implemented in Solid, but they are certainly needed, as they greatly expand the potential use cases for the Solid ecosystem.
Solid aims to replace many traditional ways of handling user data, giving the user more data ownership and control. Combined with the ever-increasing digitalization of the world, this means there will be a growing need for verifiable credentials. Users will need to digitally prove ownership and integrity of data.
To solve this in keeping with the philosophy of data ownership, we have to record and check the integrity of data while preserving full user control.
A good place to store the proof of a dataset’s integrity is a decentralized ledger, best known in the form of a blockchain. These ledgers provide an immutable network on which data can be safely stored and read. Using a public (or private) ledger as the “source of truth” greatly increases availability, security and accessibility.
The goal of this internship was to research how Solid can be combined with blockchain. This would solve the issue of verifiable credentials while keeping full data ownership with the user, without the need for an application to store any user data itself.
Initially, I started researching existing projects and DLTs (distributed ledger technologies) with Solid support or libraries, but their Solid support was either outdated or still in development. Although interesting, it was clear this wasn’t going to get me anywhere, so I decided to use the Ethereum blockchain, since this was also in line with my current development path.
Using a hash function, we can reduce data of any size to a single, deterministic hash code of fixed length, and the process cannot practically be reversed. If we hash each piece of data separately and then keep hashing pairs of the resulting hashes, we eventually end up with one hash: the root of a Merkle tree.
Merkle hash tree
Using this method we can create a single hash code representing the integrity of the data. Once this hash is stored on a blockchain, anyone holding the data can compute a new hash from it and compare that with the corresponding hash on chain. This is how the use case of verifiable credentials (VC for short) emerged.
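To make the idea concrete, here is a minimal sketch of computing and re-checking such a Merkle root. It rests on several assumptions that are not taken from the project itself: SHA-256 as the hash function, one leaf per credential field, and duplicating the last hash when a level has an odd number of nodes. The function names and the example credential are hypothetical.

```python
import hashlib
import json


def sha256(data: bytes) -> bytes:
    """Hash arbitrary bytes to a fixed-length 32-byte digest."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> str:
    """Hash every leaf, then hash pairs of hashes until one root remains."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: an odd level is padded by repeating its last hash.
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()


def credential_leaves(credential: dict) -> list[bytes]:
    """Assumption: one leaf per field, in sorted key order so the result is deterministic."""
    return [json.dumps([key, credential[key]]).encode("utf-8") for key in sorted(credential)]


# Hypothetical credential data, for illustration only.
credential = {
    "holder": "https://student.example/profile/card#me",
    "degree": "BSc Computer Science",
    "year": 2021,
}
stored_root = merkle_root(credential_leaves(credential))  # this value would go on chain

tampered = dict(credential, degree="MSc Computer Science")
print(merkle_root(credential_leaves(credential)) == stored_root)  # True
print(merkle_root(credential_leaves(tampered)) == stored_root)    # False
```

Changing any single field changes its leaf hash, which propagates up to the root, so the comparison with the stored root fails.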
We use the example of a degree or certification with three entities: a university (the issuer), a student (the holder), and an employer (the verifier).
When designing the app architecture, I initially designed a basic system with 2 applications: one to create VCs and one to verify them afterwards.
Learning Solid was a bit of a rough start, since there isn’t much documentation or many guides out there and it is still heavily in development. Eventually I got the basics of Solid data handling working using libraries from Inrupt. After that I added support for MetaMask (an Ethereum wallet) for the blockchain aspect.
I created a new React app to test out our solution. The essential parts were:
- Creating a Merkle tree hash code from given data
- Storing the hash on chain
- Verifying against the hash on chain
- Storing the credential in the Solid pod
These core aspects more or less worked, and around this time we also made some changes to the app architecture. Instead of 2 applications we would have 3, one for each entity in our example (Issuer, Holder, Verifier).
Each entity shares a credential by giving read access to whoever it belongs to or needs to see it.
- First, the issuer app is used to create a credential and store its hash on chain, using a unique Ethereum address owned by the issuer. The issuer can verify the credential and then safely add it to its Solid pod. The intended holder is automatically given read access.
- Next, the holder app is used to read this credential and create a copy of it, which is stored in the holder’s own Solid pod. The holder can then give read access to any WebID (a verifier).
- Finally, the verifier app is used to verify whether the claimed credential is in fact correct.
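The three steps above can be sketched end to end with in-memory stand-ins. Everything here is an assumption made for illustration: the `Pod` class only mimics per-resource read access and is not the Inrupt API, the `chain` dictionary replaces real Ethereum transactions, a plain SHA-256 over canonical JSON stands in for the Merkle root, and all WebIDs, paths and addresses are made up.

```python
import hashlib
import json


def integrity_hash(credential: dict) -> str:
    """Stand-in for the Merkle root: SHA-256 over canonical JSON."""
    canonical = json.dumps(credential, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


class Pod:
    """Hypothetical in-memory pod with per-resource read access lists."""

    def __init__(self, owner: str) -> None:
        self.owner = owner
        self._resources: dict[str, dict] = {}
        self._readers: dict[str, set[str]] = {}

    def write(self, path: str, data: dict) -> None:
        self._resources[path] = dict(data)
        self._readers.setdefault(path, {self.owner})

    def grant_read(self, path: str, web_id: str) -> None:
        self._readers[path].add(web_id)

    def read(self, path: str, web_id: str) -> dict:
        if web_id not in self._readers.get(path, set()):
            raise PermissionError(f"{web_id} may not read {path}")
        return dict(self._resources[path])


# Stand-in for the chain: issuer address -> hashes that address has published.
chain: dict[str, set[str]] = {}

ISSUER = "https://university.example/profile/card#me"
HOLDER = "https://student.example/profile/card#me"
VERIFIER = "https://employer.example/profile/card#me"
ISSUER_ADDRESS = "0xIssuerAddressPlaceholder"
PATH = "credentials/degree"

# 1. Issuer: create the credential, publish its hash, store it, share with the holder.
credential = {"holder": HOLDER, "degree": "BSc Computer Science", "year": 2021}
issuer_pod = Pod(ISSUER)
chain.setdefault(ISSUER_ADDRESS, set()).add(integrity_hash(credential))
issuer_pod.write(PATH, credential)
issuer_pod.grant_read(PATH, HOLDER)

# 2. Holder: copy the credential into their own pod and share it with a verifier.
holder_pod = Pod(HOLDER)
holder_pod.write(PATH, issuer_pod.read(PATH, HOLDER))
holder_pod.grant_read(PATH, VERIFIER)


# 3. Verifier: read the presented credential and compare its hash with the chain.
def verify(pod: Pod, path: str, web_id: str, issuer_address: str) -> bool:
    presented = pod.read(path, web_id)
    return integrity_hash(presented) in chain.get(issuer_address, set())


print(verify(holder_pod, PATH, VERIFIER, ISSUER_ADDRESS))  # True

# If the holder alters the copy, verification fails.
holder_pod.write(PATH, dict(credential, degree="PhD"))
print(verify(holder_pod, PATH, VERIFIER, ISSUER_ADDRESS))  # False
```

Note that the verifier never touches the issuer pod in this sketch: it only needs read access to the holder’s copy and the issuer’s address.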
To make things easier, I started 3 new applications and tried to develop them at the same time, since the core logic was already working in my first application. After coding the issuer app, I could move the parts that were needed over to the holder app and the verifier app.
After a lot of tweaks, changes and improvements, we eventually ended up with this app architecture.
VC High-level App Architecture
With this system we can create and hand out verifiable credentials that remain under the holder’s control. Holders have an incentive not to alter their data, because only unaltered data allows their claim on the credential to be proven correct.
In the Solid philosophy, we want as much of the user data as possible, if not all of it, to be in the possession and control of the users themselves. We only need to store a proof of the data’s integrity somewhere else that is secure from manipulation.
The Ethereum network provides this safe storage without requiring the holder to interact with it at all; only the issuer and the verifier have to.
The immutability and upkeep of a blockchain give us the security we need, while its public nature (in our case) lets third-party apps verify credentials too, without the need for any issuer-specific API. The whole setup also keeps working if the issuer pod gets corrupted.
But there are still many changes and improvements to be made.
Currently, the Ethereum address used to create the transactions that store our hash codes is hard-coded. This address represents the issuer, and the verification process needs it in order to find the hash on chain and prove that the issuer is valid.
To solve this, we could make the verifier app read the specific Ethereum address from the issuer pod itself, but then the verifier would have to be able to trust the issuer based on its WebID.
Alternatively, we could hard-code an issuer admin who, with a corresponding Ethereum address, could create credentials for other issuers, making those issuers valid and trusted within our system.
Also subject to change is how we store the Merkle tree’s resulting hash: currently it goes into the data of an Ethereum transaction. This seemed cleaner and less costly (although the difference doesn’t matter that much). Eventually, though, we’ll have to store the hashes in a smart contract instead, so that credentials can be deleted or, better said, deactivated, which makes them fail the verification process. This would also open up many other possibilities in the world of crypto, as other smart contracts would be able to communicate with and alter these records. It would also let the issuer store or deactivate multiple credentials in the same transaction, which would be needed in a lot of use cases.
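As a rough illustration of the state such a contract would keep, the following is a plain Python model, not Solidity and not the project’s implementation. The class name `CredentialRegistry`, its methods, and the rule that only the storing address may deactivate a hash are all assumptions.

```python
class CredentialRegistry:
    """Hypothetical model of a contract's storage: hash -> (issuer address, active flag)."""

    def __init__(self) -> None:
        self._entries: dict[str, tuple[str, bool]] = {}

    def store(self, sender: str, hashes: list[str]) -> None:
        """Register several hashes in one call (one 'transaction')."""
        for credential_hash in hashes:
            if credential_hash in self._entries:
                raise ValueError(f"hash already registered: {credential_hash}")
        for credential_hash in hashes:
            self._entries[credential_hash] = (sender, True)

    def deactivate(self, sender: str, hashes: list[str]) -> None:
        """Mark several hashes as inactive; assumed to be allowed for their issuer only."""
        for credential_hash in hashes:
            entry = self._entries.get(credential_hash)
            if entry is None:
                raise KeyError(f"unknown hash: {credential_hash}")
            if entry[0] != sender:
                raise PermissionError("only the issuing address may deactivate")
        for credential_hash in hashes:
            self._entries[credential_hash] = (sender, False)

    def is_valid(self, issuer: str, credential_hash: str) -> bool:
        """A credential verifies only if this issuer stored it and it is still active."""
        return self._entries.get(credential_hash) == (issuer, True)


# Hypothetical addresses and hashes, for illustration only.
registry = CredentialRegistry()
registry.store("0xIssuer", ["hash-a", "hash-b", "hash-c"])
registry.deactivate("0xIssuer", ["hash-b"])

print(registry.is_valid("0xIssuer", "hash-a"))  # True
print(registry.is_valid("0xIssuer", "hash-b"))  # False
```

Keeping an active flag instead of deleting the entry means a verifier can still distinguish a revoked credential from one that was never issued, if the app wants to report that difference.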
A transaction has a fee, called gas, which is paid in ether (ETH), Ethereum’s currency. This fee depends on multiple factors, such as the complexity of the transaction, the amount of data passed along and network activity.
The gas cost for these transactions can seem high at times, but ongoing improvements and clever layer 2 scaling solutions make extremely low gas costs possible. So I am not worried about this aspect.
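For a feel of the numbers, the fee of a transaction is the gas it uses multiplied by the gas price. The sketch below estimates this for a transaction that only carries a 32-byte hash as data. The constants are assumptions based on the Ethereum mainnet rules introduced with EIP-2028 (21,000 base gas, 16 gas per non-zero data byte, 4 per zero byte); later protocol upgrades may change them, and the gas price of 20 gwei is just an example value.

```python
from decimal import Decimal

# Assumed constants (EIP-2028 era); check the current protocol rules before relying on them.
BASE_TX_GAS = 21_000
GAS_PER_NONZERO_BYTE = 16
GAS_PER_ZERO_BYTE = 4
WEI_PER_GWEI = 10**9
WEI_PER_ETH = 10**18


def data_tx_gas(payload: bytes) -> int:
    """Gas for a plain transaction whose only extra cost is its data payload."""
    zero_bytes = payload.count(0)
    nonzero_bytes = len(payload) - zero_bytes
    return BASE_TX_GAS + zero_bytes * GAS_PER_ZERO_BYTE + nonzero_bytes * GAS_PER_NONZERO_BYTE


def fee_in_eth(gas_used: int, gas_price_gwei: int) -> Decimal:
    """Fee = gas used x gas price, converted from wei to ETH."""
    fee_wei = gas_used * gas_price_gwei * WEI_PER_GWEI
    return Decimal(fee_wei) / Decimal(WEI_PER_ETH)


payload = bytes.fromhex("ab" * 32)  # a 32-byte hash with no zero bytes (worst case)
gas = data_tx_gas(payload)          # 21_000 + 32 * 16 = 21_512
print(gas)
print(fee_in_eth(gas, 20))          # fee in ETH at an example price of 20 gwei
```

The data itself adds only a few hundred gas on top of the base cost, so under these assumptions the fee is dominated by the fixed cost of sending any transaction at all.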
It would also be possible to add expiration dates to credentials. To do this, an expiration date and time would have to be added to the credential. Before we compute the Merkle tree hash of our data, we first check that the expiration date has not yet passed. A positive boolean is then added to the data to be hashed, alongside the expiration date.
Later, the verification process performs the same check. If the expiration date has already passed, a negative boolean is added to the data instead, which results in a different integrity hash and so makes the credential invalid.
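A minimal sketch of this expiration scheme follows. The field names `expiresAt` and `notExpired` are hypothetical, a plain SHA-256 over canonical JSON stands in for the Merkle tree hash, and the current time is passed in explicitly so the behaviour is easy to check.

```python
import hashlib
import json
from datetime import datetime, timezone


def integrity_hash(credential: dict, now: datetime) -> str:
    """Hash the credential together with a boolean saying whether it is still unexpired."""
    expires_at = datetime.fromisoformat(credential["expiresAt"])
    payload = dict(credential, notExpired=now < expires_at)
    canonical = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()  # stand-in for the Merkle root


def issue(credential: dict, now: datetime) -> str:
    """Issuer side: refuse to issue an already expired credential, else return the hash to store."""
    if now >= datetime.fromisoformat(credential["expiresAt"]):
        raise ValueError("credential is already expired")
    return integrity_hash(credential, now)


def verify(credential: dict, on_chain_hash: str, now: datetime) -> bool:
    """Verifier side: recompute the hash with the flag as it evaluates right now."""
    return integrity_hash(credential, now) == on_chain_hash


# Hypothetical credential and dates, for illustration only.
credential = {
    "degree": "BSc Computer Science",
    "expiresAt": "2030-01-01T00:00:00+00:00",
}
on_chain = issue(credential, datetime(2024, 6, 1, tzinfo=timezone.utc))

print(verify(credential, on_chain, datetime(2025, 6, 1, tzinfo=timezone.utc)))  # True
print(verify(credential, on_chain, datetime(2031, 6, 1, tzinfo=timezone.utc)))  # False
```

Because the expiration date is itself part of the hashed data, a holder cannot extend the validity by editing the date: that would change the hash just as any other alteration would.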
A nice-to-have feature would be to make everything GDPR-compliant. After the holder copies their credential, we could give them the option to delete the original credential from the issuer pod. This way the issuer pod would hold no user data whatsoever.