Sunday, March 11, 2012

Case Sensitivity and Issues of Usability

Once upon a time, when I had only ever worked with Windows-based computers, I discovered the world of Linux/Unix through Cygwin (as a result of needing, or rather wanting, to run GNU LilyPond, a story in itself). That was also when I discovered the issue of case sensitivity, or rather the lack of it in Windows (but its presence in Linux).

I can't remember the exact reason(s), but at the time, I found this really cool, and wondered why this wasn't the case on ALL systems.

Perhaps it was because I was drawn to the possibility of having a set of items with similar names without having to compromise the names so that they wouldn't bugger each other up, or some other variation on this theme which would've simplified various file renaming exercises. Or perhaps it was the notion that I'd have 26 more version indicators at my disposal (NOTE: this was still back in the days when I'd never touched a decent version control system, and if I had, it would've ended up being CVS).

Fast forward a few years. In the meantime, I've grown increasingly familiar with operating computers via the command line (admittedly, there are still many things I've yet to master, and there are still things that I'd do via a GUI any day simply because it seems less destructive and error-prone). During this time, I've also spent a decent amount of time working with real Linux systems, not just Cygwin "hosted environments" or live CDs. However, at the end of the day, for every hour with a Linux system during this time, I've probably spent 10 or more on one of my Windows-based machines at home.

After working with the Windows command line for so long, let's just say that a few habits have been developed. (Despite the bagging that it often gets, the Windows command line actually isn't half as bad as it is often made out to be; as with everything else, it's just a matter of whether you care enough to try and customise it until it works right.) In particular, trying to go from a Windows command line to a Linux one raises several issues:
  1. Linux makes a distinction about calling executable things in the current directory vs from various global paths. That is, that annoying thing where you must slap a big ugly "./" in front of any filename to execute it in the current directory. I've read numerous things about why this is so in the past, but I really never understood the rationale (and still don't or can't remember/care to now!). What on earth is wrong with trying the current directory as a last resort if you can't find something in the global search paths?! (NOTE: I'm fine with doing this when trying to force execution to use the local copy. But in that case, I'm trying to force/override default behaviour since I know it will be wrong)
  2. Linux often cares whether you've slapped the right extension on an executable when trying to run it. Frankly, I couldn't care less about the extension 99% of the time; just run whatever looks most runnable, unless there are like 10 things with a similar name (sans extension) which could all be reasonable to run (i.e. they've been marked as executable, and their extensions suggest this).
  3. Case sensitivity in Linux is an absolute PITA when trying to quickly get something to run. For instance, I'm sure everyone knows how much easier it is to just type everything in lowercase, since you don't need to spend time coordinating fingers (or otherwise) to hold down shift. Now, when navigating directory structures with multiple levels, the last thing you want is to have directories with uppercase letters on a case-sensitive terminal, especially if these uppercase letters are scattered throughout the names at various levels, since, even if you use tab-completion, you'd still be typing away for a while. In fact, you'd even struggle to remember the exact directory names at times.

Side note: these problems are at their worst when dealing with SSH terminals, where you have no visual fallbacks to check in the middle of trying to conjure up some path names. Due to laziness, I'm increasingly moving towards shallow file systems with really short names for such use cases.

This last point (re case sensitivity) is the main issue of this post. Although I once thought that case-sensitive stuff was great, especially when trying to enforce some naming schemes for my files, I've come to realise that in many, many more cases, it's actually a nuisance for precisely the reasons mentioned. When trying to get things done fast, trying to remember and type the precise permutation of uppercase letters required is a pain. And let us not forget those who aren't that great at remembering long chains of stuff (aka most of the world's population, as opposed to many geeks who are stereotypically predisposed to having great memories for exact names of a lot of obscure stuff).

For this reason, it's no secret that most search engines and search features these days default to being case insensitive. Yes, it's much simpler to just do exact matches (case sensitive). But case-sensitive search really is a rare use case (i.e. when we actually know the specific case of what we're searching for, and the search results are too vague/inclusive to be useful), not the norm. I must also note for anyone reading this (in case they implement a search box in future) that I've always believed it's evil for search engines to require users to specify various concoctions of boolean operators and other domain-specific languages in their query strings before they can do anything useful (though most these days work well enough without needing this, except in occasional cases when some prodding is needed).
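
As a rough illustration of that default (a minimal sketch only, not how any particular search engine works; the function name search and the case_sensitive flag are made up for this example), a search box could fold case unless the user explicitly asks for an exact match:

```python
def search(items, query, case_sensitive=False):
    """Return the items containing query, ignoring case unless asked not to."""
    if case_sensitive:
        # Opt-in exact matching for the rare case where the casing is known.
        return [item for item in items if query in item]
    folded = query.casefold()
    return [item for item in items if folded in item.casefold()]


# Hypothetical file names, used only to exercise the function.
files = ["README", "readme.txt", "Makefile", "notes.md"]
print(search(files, "readme"))                       # case-insensitive default
print(search(files, "README", case_sensitive=True))  # exact match on request
```

The point of the design is simply that the lazy, all-lowercase query does the forgiving thing, and the stricter behaviour costs one extra option.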

So, what should we be doing about all this?
- Case sensitivity for filesystems is still great to have. I'd argue that the annoyance of NOT being able to rename a file just because the new name would clash with another one that differs only in case is not worth it.
- Command lines, search tools, and basically any tool where users are required to provide textual input of any kind (and especially for quick access to stuff) should be case insensitive to maximise efficiency and decrease annoyance.
- Where there are multiple options possible due to the different permutations of uppercase letters, all of these should get presented to the user to clarify which one was intended. This goes for command lines too, or rather in particular; autocomplete already allows for similar capabilities, so I don't see why not!
- In the case of hotkeys, these should NEVER be case sensitive. Hotkeys should be considered to be based on which key on the keyboard is pressed, not the symbol which we happen to be able to get out of the keyboard.
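
To make the second and third points a bit more concrete, here is a minimal sketch of how a tool might map lowercase input onto case-sensitive names. It is only an illustration under my own assumptions: the function name resolve and the sample directory entries are hypothetical, and no existing shell is claimed to work this way.

```python
def resolve(typed, names):
    """Map what the user typed onto the existing names, ignoring case.

    Returns a list: one entry means an unambiguous match, several entries
    are the candidates to show the user, and an empty list means no match.
    """
    if typed in names:
        # An exact match always wins, so every case-sensitive name stays reachable.
        return [typed]
    folded = typed.casefold()
    return sorted(name for name in names if name.casefold() == folded)


# Hypothetical directory listing on a case-sensitive filesystem.
entries = ["Projects", "projects", "Renders", "notes.txt"]
print(resolve("renders", entries))   # one match despite the lowercase input
print(resolve("PROJECTS", entries))  # two candidates to present to the user
print(resolve("projects", entries))  # exact match, so no prompt is needed
```

Checking for an exact match first means the filesystem can stay case sensitive while lazy lowercase typing still works; the user only gets asked when the input is genuinely ambiguous.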

----

As a closing note, here is a short list of things where the Linux command line is nicer:
  1. Multi-coloured output. Different parts of the text can be different colours.
  2. Forward slashes only. Most things on Windows cope well with forward slashes (and I tend to just use forward slashes everywhere these days, unless for whatever reason something balks, in which case I have to do platform checking hacks). However, the single thing that I've found so far which copes awfully with this is that the Command Prompt only knows how to execute and/or autocomplete paths written using backslashes. Grr!
  3. Default buffer/window sizes. In Linux, I've found these practically limitless (?), but in Windows, I always end up needing to manually preconfigure all my command prompts to have width 200 and height 9000 (or higher).

12 comments:

  1. If you really think it's a pain to type "./" when executing files from the current directory, just append it to your PATH.

    export PATH=$PATH:./

    This would give the behaviour you ask for. Add it to your shell's startup script and you'll never have to think about it again.

  2. Cool. I'll have to remember to try that next time I'm working on one of those boxes.

  3. Thanks Johnny for the tip.

    Some commands like 'find' and 'grep' have an -i option that makes them case-insensitive.

    Case sensitivity can also be a real time-saver. For instance, in a source code archive, it makes it faster to read the Makefile, the README or the INSTALL file, because only their initial is needed for autocompletion.

    Personally, I use capitalized names for folders I use often, and sometimes to "protect" files against accidental deletion. For instance, in a folder containing Blender renders, if I have:

    Myfinalrender.png
    mytemporaryrender0001.png
    .....
    mytemporaryrender0250.png

    Then the former will be safe against rm -m .

  4. (Sorry, I meant rm m*)

  5. You can make Bash's tab completion case insensitive:
    http://www.cyberciti.biz/faq/bash-shell-setup-filename-tab-completion-case-insensitive/

    I always do this on a new Linux install these days.

    Replies
    1. Interesting... Will have to try this out some time. One more thing to add to my customisation lists then!

      Still, it's slightly annoying that I'd need to go through tab completion instead of just typing everything lowercase, though it's a lot less annoying than the default behaviour I believe.

    2. Who are you kidding? Tab completion saves a whole bunch of typing and time since you only have to type enough letters to uniquely identify the folder - this allows you to have long folders that fully describe what you are working on and still be able to get through them quickly on the command line.

      Having worked in a studio with Windows-based artists, a Linux-based fileserver, and some tools that are case insensitive, I personally think that case-insensitive filesystems are better for the majority of users. Our artists have been trained to keep their filenames consistent, but it wasn't easy to do.

  6. Linux actually doesn't care about the extension at all; whether a file can be run is determined by the 'executable' bit in the permissions setting. What will happen if you run it depends on the contents of the file.

    Replies
    1. I'm aware of this, but at the same time, have run into problems where just leaving off the extension is often enough to make the shell vomit in your face about unknown files.

      EDIT: just tried this via ssh to confirm. Yes, Linux DOES care that you put an extension on the filename when trying to execute. Otherwise it complains about unknown files.

  7. I've never heard of *nix caring about filename extensions, and I don't think I have saved any scripts with an extension. In fact, look through your system and you'll find more scripts than you think without any filename extension (apachectl, apxs, all the rc.d scripts). What matters is that the permissions have execute enabled; for your own scripts, you can also just add sh in front of the filename (or perl/python etc). While some Linux distros may add the need for extensions, I think it more likely that the uni admins have added it. Apart from ~/.bashrc and ~/.profile, the global /etc/profile may also enforce something like that.

    When adding ./ to PATH, I prefer to put it at the start so the local file is chosen before a system file of the same name. I also have a ~/bin folder with my personal scripts collection.

    With tab completion, did you know you can get a list of matches? In bash, double-tab to see the list; in csh, 'set autolist' turns it on.

    You may be interested in using CDPATH (once set up, you can have cd blender take you to ~/projects/blender-trunk/blender). You may also find http://paksoy.net/post/158766842/easy-bash-complete of interest.

  8. With respect to file extensions:
    Windows treats file extensions as a "special" part of the file name, whereas most *nix shells just treat file extensions the same as any other part of the file name. That means that, yes, in *nix you need to type the "file extension", because it's part of the file name just as much as any other part.

    Personally, this jives just fine with me, and I've never had any issues with it. Tab completion makes it moot, IMO.

    I agree that the ./ thing is odd. But it's never bothered me too much.

  9. Lawrence D’Oliveiro, March 17, 2015 at 5:00 PM

    Don’t put “.” in your PATH. Back in the day, everybody did. Then it was realized what a security hole it was...
