Tuesday, July 22, 2014

Programming Concepts: MVC

MVC, or Model-View-Controller, structuring enables a developer to change one part of a program without having to dig through the others. Ideally, it even lets you re-use parts without having to rebuild everything from the ground up. The name of the game with MVC is modular design. But to understand the structure (and the overwhelming number of frameworks which utilize it), you first have to understand the concepts behind each part.

Model -- This is best thought of as physical objects and their attributes. The same way you'd model an object in say, Blender, you can model an object's data. For this, you should ideally have good knowledge of Object Oriented Programming, as you will of course, be modelling objects.

View -- This is a scene to those who build games or a template to application developers. Essentially, what you're describing here is how to display the output data to the client. In many instances, you may build a different view for say, each level, or a set of levels. You would likely build one for various different menus and the like as well.

Controller -- This is the bit responsible for connecting the model and the view. Say, for example, you want every section of a site to have a view corresponding to its model, so the data is displayed as efficiently as possible. The controller is what makes that possible. It also describes what should happen if no matching view is found, or if you wish to differentiate between friends and non-friends on a social network and the data that's loaded from the model into the view. (Though largely, that check and return of data should be handled when modeling the target user object.)
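
As a rough sketch of the three parts working together, here is how it might look in Python. The Profile class, the view functions, and the VIEWS lookup are all made-up names for illustration and don't come from any particular framework.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """Model: an object's data and attributes (hypothetical example class)."""
    name: str
    email: str

    def visible_fields(self, viewer_is_friend: bool) -> dict:
        # The friend / non-friend check lives on the model.
        fields = {"name": self.name}
        if viewer_is_friend:
            fields["email"] = self.email
        return fields


def profile_view(fields: dict) -> str:
    """View: describes how the data is displayed."""
    return ", ".join(f"{key}: {value}" for key, value in fields.items())


def not_found_view(fields: dict) -> str:
    """View: fallback used when a section has no view of its own."""
    return "Nothing to show here."


# Hypothetical registry mapping site sections to their views.
VIEWS = {"profile": profile_view}


def controller(section: str, model: Profile, viewer_is_friend: bool) -> str:
    """Controller: loads data from the model into the matching view."""
    view = VIEWS.get(section, not_found_view)
    return view(model.visible_fields(viewer_is_friend))


ada = Profile("Ada", "ada@example.com")
print(controller("profile", ada, True))
print(controller("profile", ada, False))
print(controller("settings", ada, True))
```

Swapping in a different view, or adding a new section to VIEWS, doesn't require touching the model at all, which is the modularity MVC is after.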

So now that we have some idea of what MVC is, how is it structured in practice? Well, if you want to see an MVC structure broken down into its components, you don't have to look much further than what's available in pretty much any modern browser: HTML, CSS, and Javascript.

HTML is our Model and should define the data being loaded. A webpage should be able to be its own standalone document without the need for external structuring and the like, beyond what's served. For example, loading up header before main and also on-page navigation prior to content and content prior to footer.

CSS is our View and should define the way the data is presented. HTML does have elements such as <div> and <span>, and even formatting elements like <b>, <i>, and <u>. These should largely be used for marking up a document, that is, marking text with a specific purpose such as headers and foreign or emphasized words, rather than as stylistic elements. CSS can and does provide the ability to style these elements in any way you desire.

Javascript then, of course, would be our controller. It can modify anything in the DOM, both removing existing content and adding new content. It can be used as a first layer of validation or even for animating various elements. These days it can even serve as a standalone programming language for game development with frameworks such as Phaser, or mimic the capabilities of a specialized language such as R.

Monday, July 14, 2014

Data Security: Properly Storing Passwords

One of the bigger things I've seen done wrong time and again throughout the years, would be people storing passwords improperly. (Even more in my last year as a consultant for my own short-lived firm, Ekohm.) That is to say, storing such sensitive data in an insecure manner, and in fact storing most sensitive data insecurely. What many forget is a simple rule... If you can read it, an attacker can read it.

First, let's cover some rules about what not to do...
  • Do not under any circumstances, EVER, transmit or store sensitive information in plaintext, nor in anything that's simply an easy conversion of it.
  • Do not store passwords with a normal, reversible encryption algorithm. Encrypted data can be decrypted by anyone who holds the key, and when you're simply encrypting on the server, that key is sitting on the same server as the data. (Secure communication generally relies on a public and private key pair, where the private key never leaves its owner.) So, in the event of a breach that exposes the key, you're looking at all of that data being made available. An exception to this might be some kind of algorithm to mask email addresses, phone numbers, and other data that's sensitive but not quite as important as passwords. The advantage there is stopping any meaningful data loss in the event of simple attacks like SQL injection.
  • In fact, don't store the user's password at all.
  • Do not reveal the user's securely stored password publicly, even to them. You shouldn't be able to in plaintext if you've done things properly, even if you wanted to do such. But you also shouldn't reveal any of the stored private data in general.
What you should be doing...
  • When transmitting data, enforce the use of SSL/TLS. This should be done on pretty much any page where you're potentially going to be receiving data from users or visitors. You may throw the recent Heartbleed issue of OpenSSL around all you want, but that issue was addressed. (Granted, probably not as early as it should've been because apparently the US security community knew about it for 2 years or more prior. Take from that what you will.)
  • Utilize cryptographic HASHING algorithms. This is specifically true of when you store passwords. You don't ever need to know what another person's password is; it's a secret. This is why we use hashing algorithms instead of reversible encryption. A hash is a one-way function, meaning it shouldn't be possible to reverse it to reveal the original word. What you're doing is making sure the password never exists in plaintext except while it's in memory. (AKA: Why Heartbleed was bad.)
  • Utilize a unique salt value for each user. A salt is an additive used to further scramble the password. Usually for these I'll use a randomly generated binary value (stored as hexadecimal) that changes whenever a user updates their password. Additionally, when storing the salt, I check to make sure its value isn't the same as any other user's salt. You must store the salt. This is your half of the key, something like a public key.
  • When storing the password, you should be hashing it together with the salt, then hashing the result again, repeatedly. The number of times you mix helps against the main vulnerability of hashes: brute force attacks on the final hash value. The number of iterations is up to you, but the more you do, the longer each guess takes to check and the longer each additional password takes to brute force. This is what is referred to as stretching the stored hash. The final result of this mixture is the second value you store, and it must be matched later when a user tries logging in. You must store the stretched and salted hash. The submitted password acts as the user's private key, which, combined with your salt the specified number of times, will match the stored value.
And that is how to securely store a password. I personally, currently favor a 512-bit hashing algorithm with at least 20,000 iterations to the final value that will be stored.
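
To make the steps above concrete, here is a minimal sketch in Python. It assumes PBKDF2 with HMAC-SHA512 as the stretching scheme, which is one standard way to do the salting and iterating described above and not necessarily the exact mixing described in this post. The function names and the 32-byte salt length are choices made for the example, and the cross-user salt uniqueness check is left out because it needs a database.

```python
import hashlib
import hmac
import secrets

# The post's stated minimum; an assumption, tune it for your own hardware.
ITERATIONS = 20_000


def hash_password(password: str) -> tuple:
    """Return (salt_hex, hash_hex) for storage; the password itself is never stored."""
    salt = secrets.token_bytes(32)  # random binary salt, regenerated on every password change
    digest = hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, ITERATIONS)
    return salt.hex(), digest.hex()


def verify_password(password: str, salt_hex: str, hash_hex: str) -> bool:
    """Re-derive the hash from the submitted password and compare in constant time."""
    digest = hashlib.pbkdf2_hmac(
        "sha512", password.encode("utf-8"), bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(digest, bytes.fromhex(hash_hex))


salt_hex, hash_hex = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt_hex, hash_hex))
print(verify_password("wrong guess", salt_hex, hash_hex))
```

Only the two hex strings are stored. At login the same derivation is run against the submitted password and the results are compared.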

One last thing to keep in mind. No security is perfect, everything is ultimately vulnerable in some way. And it's just a matter of them finding the proper attack vector to exploit. So, you need to keep current on what's going on in the security world, in addition to the developer communities you involve yourself with.

Graceful Degradation (A Minor Add-on)

As I was fact checking my post the other day about graceful degradation, specifically in terms of Flash, I kept coming across people pushing Javascript as a method. Furthermore, they were pushing a jQuery function to detect whether or not the plugin necessary for Flash is present and enabled.

My question then is, what happens when neither Flash nor Javascript is available? Do they then have a noscript tag set up to inform visitors of the need for Javascript, which in turn informs them of the need for Adobe’s Flash plugin? How does that not seem ridiculous, and how am I not just going to go to another site?

The bottom line is that there’s no reason to ever use Javascript to detect Flash’s existence. Ever. Unless maybe, you're trying to save like 120 bytes from your HTTP reply, but that also seems kind of ridiculous since you're supposedly loading up the jQuery library to do such. And while it is unlikely, there are those who have both disabled, with their own reasoning.

Sunday, July 13, 2014

Control Structures: If Else, Else If

Building upon the knowledge of the if() statement, we have the else clause. Essentially, if the if() evaluates to false, else runs some alternative code. In addition to this, we also have else if() (in some languages elseif() or elif()), which provides a second if() check for outcomes that aren't simply binary.

Encapsulated C / Java Style Pseudocode:
int var1 = 10;
if(var1 <= 5) {
    print('var1 is 5 or less.');
    return var1;
} else {
    print('var1 is greater than 5.');
    return var1;
}

Unencapsulated Basic / Python Style Pseudocode:
var1 = 10
if(var1 <= 5):
    print('var1 is 5 or less.')
    return var1
else:
    print('var1 is greater than 5.')
    return var1

Outputs:
var1 is greater than 5.
10 (as a returned value)

To note before moving on, you don't always need an else to match your if statements. However, it's generally a good idea within a function that uses them to have some alternative type of return value for debugging purposes. In some instances, folks will use an application specific error code or simply return a boolean value of FALSE.

Next we'll do an example of else if in use. Again, it's going to do essentially the same thing as when we would use another if() statement, but it's going to only trigger if the preceding if() and elseif() statements evaluate to false, while it, itself, evaluates to true.


Encapsulated C / Java Style Pseudocode:
// Loosely typed pseudocode: assumes comparing 'fish' to a number is simply false.
var1 = 'fish';
if(var1 <= 5) {
    print('var1 is 5 or less.');
    return var1;
} else if(var1 == 'fish') {
    print('var1 is fish...');
    return var1;
} else {
    print('var1 is greater than 5.');
    return var1;
}

Unencapsulated Basic / Python Style Pseudocode:
# Loosely typed pseudocode: real Python 3 raises a TypeError for 'fish' <= 5.
var1 = 'fish'
if(var1 <= 5):
    print('var1 is 5 or less.')
    return var1
elif(var1 == 'fish'):
    print('var1 is fish...')
    return var1
else:
    print('var1 is greater than 5.')
    return var1


Outputs:
var1 is fish...
fish (as a returned string value)

And that's pretty much all there is to know about if, elseif, and else statements. You can easily try them out on just about any fiddle-type site or in any language, assuming you've got the syntax correct for your specific language.
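
For anyone who wants to paste something into an interpreter, here is a runnable Python take on the pseudocode above. It is only a sketch: the function name describe is made up, and the isinstance() guard is an addition, needed because real Python 3 refuses to compare a string with a number.

```python
def describe(var1):
    # Only compare numerically when var1 is actually a number;
    # Python 3 raises TypeError for 'fish' <= 5.
    if isinstance(var1, (int, float)) and var1 <= 5:
        print('var1 is 5 or less.')
    elif var1 == 'fish':
        print('var1 is fish...')
    else:
        print('var1 is greater than 5.')
    return var1


describe(3)
describe('fish')
describe(10)
```

Only one branch ever runs per call: the first condition that evaluates to true wins, and else catches whatever is left.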

Control Structures: If Statements

In coding we need to control how an application and its components run, when they run, and IF they run. This is going to be an overview focusing on that last part: determining if a piece of code should run or not.

The basic idea of an if() statement is that if whatever expression you're passing within the parentheses evaluates to true, the next bit of code runs.

Encapsulated C / Java Style Pseudocode:
var1 = 1;
if(var1) {
    print('It\'s true!');
    return var1;
}
 
Unencapsulated Basic / Python Style Pseudocode:
var1 = 1
if(var1):
    print('It\'s true!')
    return var1

In the instance of Python-style syntax, indentation matters. So keep that in mind. That's also why I personally prefer languages with encapsulation for these types of things. At the same time, if you begin nesting them, it can become difficult to tell where any one ends if you aren't indenting properly anyway. A nested if statement is generally used when you wish to run some code and then evaluate further for the execution of additional code. An example of nested if statements:


Encapsulated C / Java Style Pseudocode:
var1 = 1;
var2 = 1;
if(var1) {
    print('It\'s true!');
    if(var2) {
        return var1;
    }
}
 
Unencapsulated Basic / Python Style Pseudocode:
var1 = 1
var2 = 1
if(var1):
    print('It\'s true!')
    if(var2):
        return var1

And that's pretty much all there is to know about if statements. You can easily try them out on just about any fiddle-type site or in any language, assuming you've got the syntax correct for your specific language.
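
Here is a runnable Python version of the nested example, offered as a sketch. The return statements only make sense inside a function, so the pseudocode is wrapped in one. The name check and the explicit None fallback are additions for illustration.

```python
def check(var1, var2):
    if var1:
        print("It's true!")
        if var2:
            # Reached only when both conditions are true.
            return var1
    # Fallback when either condition is false.
    return None


print(check(1, 1))
print(check(1, 0))
print(check(0, 1))
```

Note that the inner if is never even evaluated when the outer one is false, so the third call prints nothing before returning None.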

Number Sets: Binary

The most base of any of the machine languages, binary. There isn't much to say and yet, everything you see here is made up of a complex combination of 0s and 1s influencing, essentially, switches and gates controlling the electricity within the components.

So, quite the contrary, you could easily give an entire semester-long class on binary alone and how it influences various types of hardware, covering the control structures and commands these strings of two digits combine into, and the instructions they form, which we read as human-readable Assembly (usually in x86 syntax).

But, what is the number set? Simply this...
  • 0 = 000
  • 1 = 001
  • 2 = 010
  • 3 = 011
  • 4 = 100
Each significant digit doubles the previous one's value, on up to the largest number you desire to make, and you read a number by adding together the place values of its 1s. Like decimal, octal, or hexadecimal, the number is only a representation of a concept. In the case of binary, it's TRUE/FALSE or ON/OFF, no gray areas, simple as can be, and it creates the basis for Boolean logic, which forms the most basic of machine and electrical engineering logic. This logic allows those who create hardware to build various types of gates and switches similar to the logic we use in higher level programming, only with what is essentially sticks and stones level technology.
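
The doubling rule can be checked with a short Python sketch. The helper names to_binary and from_binary are made up for the example, and Python's built-in bin() and int(text, 2) do the same job.

```python
def to_binary(n: int, width: int = 3) -> str:
    """Build the binary string by repeatedly dividing by two."""
    bits = ""
    while n > 0:
        bits = str(n % 2) + bits  # the remainder is the next digit, right to left
        n //= 2
    return bits.rjust(width, "0")  # pad with leading zeros, as in the list above


def from_binary(bits: str) -> int:
    """Read left to right, doubling the running total before adding each digit."""
    total = 0
    for bit in bits:
        total = total * 2 + int(bit)
    return total


for n in range(5):
    print(n, "=", to_binary(n))

print(from_binary("100"))
```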

For example, the most significant bits in a binary string may pass through a series of logic gates in a specific order, selecting an operation. That allows the payload of less significant digits to specify, say, a memory address to which the remaining value is sent by the processor to be stored in active memory. Another command with a similar design can then activate the gates that cause the retrieval of that value.

Each digit of binary is called a bit. Eight bits make up a byte, and half a byte is called a quartet (more commonly, a nibble). In a single-byte encoding like ASCII, every key you press sends the value of a single byte, and each character here is a single byte. For example, the most common character, "e", equates to a binary value of 0110 0101, which is 101 in decimal, 0x65 in hexadecimal, and o145 in octal.

This is why there's a major difference between a megabyte and a megabit, as seen in connection speeds, and now at the terabyte/terabit level with hard drives. A byte is 8 bits, so a megabit (1,000,000 bits) is one eighth of a megabyte (8,000,000 bits), and a connection advertised at 100 megabits per second moves at most 12.5 megabytes per second. (Just a heads up, because the abbreviations Mb and MB are easy to mix up. Drives have their own wrinkle: manufacturers count a kilobyte as 1000 bytes, while many operating systems count 1024, which is why your new 1TB drive shows up as roughly 931GB...)
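
A quick arithmetic check of the units in Python, as a sketch. It assumes 8 bits per byte, decimal prefixes (mega = 10**6, tera = 10**12) for advertised figures, and binary gigabytes (2**30 bytes) for what an operating system reports. The function names are made up for the example.

```python
BITS_PER_BYTE = 8


def megabits_to_megabytes(megabits: float) -> float:
    """Convert a size or rate in megabits to megabytes (both decimal, 10**6)."""
    return megabits / BITS_PER_BYTE


def advertised_tb_to_gib(terabytes: float) -> float:
    """Convert decimal terabytes (10**12 bytes) to binary gigabytes (2**30 bytes)."""
    return terabytes * 10**12 / 2**30


print(megabits_to_megabytes(100))         # 12.5
print(round(advertised_tb_to_gib(1), 1))  # 931.3
```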

Perhaps something interesting in regard to characters, though, is that the 1 and 0 you're seeing don't equate to 0000 0001 and 0000 0000, but to their character values of 0011 0001 (0x31) and 0011 0000 (0x30) respectively. This is because you aren't sending a numeric value, but the value that identifies the character. That value is then looked up in a font file to fetch the specific data of, say, an Arial typeface character, which is passed back through the hardware and eventually output on the screen.
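
The character values mentioned above can be inspected directly in Python. This is a small sketch using the built-in ord(), format(), hex(), and oct(). The helper name char_codes is made up for the example.

```python
def char_codes(ch: str) -> dict:
    """Return the stored value of a single character in four number systems."""
    code = ord(ch)  # numeric code point; equal to the byte value for ASCII characters
    return {
        "binary": format(code, "08b"),
        "decimal": code,
        "hex": hex(code),
        "octal": oct(code),
    }


for ch in ("e", "1", "0"):
    print(repr(ch), char_codes(ch))

# The character "1" is not the number 1.
print(ord("1") == 1, int("1") == 1)
```

Python writes octal with a 0o prefix, so the o145 above shows up as 0o145.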

And that, is an overview of binary and how it makes everything just... work.

Number Sets: Octal

Octal is one of those weird things in computer science that exists, but we don't really use a lot. Well, something we don't use a lot anymore, because it was the stopgap between pure binary and hex. Welcome to one reason to potentially loathe IBM. Instead of having a quartet in binary like we do with hexadecimal, we get triplets. So, here's the break-down between decimal, binary, and octal.
  • 0 = 000 = 0
  • 1 = 001 = 1
  • 2 = 010 = 2
  • 3 = 011 = 3
  • 4 = 100 = 4
  • 5 = 101 = 5
  • 6 = 110 = 6
  • 7 = 111 = 7
Again, we begin counting with 0. But you'll notice something immediately disadvantageous about using octal over hex. Not only are you manipulating smaller chunks, you're also not getting the full breadth of decimal digits (there's no 8 or 9), and decimal is the number system used by the majority of the human world's civilizations.

Another thing is, if you remember how hex worked, 0xFF = 1111 1111. 8 bits make one byte, and thus a hex couplet perfectly represents a single byte. In octal we have triplets, and thus require an extra digit to make a byte: o377 = 011 111 111 = 0xFF = 1111 1111.

So, from there you can see the issue, I hope. Three octal digits cover 9 bits, so for each byte the most significant bit of the leading digit goes unused, and the waste adds up as your numbers get larger. But then, why didn't we structure around an octet of octal digits, since 24 bits would fit evenly into the byte structure? Which would you rather...


o77777777 = 111 111 111 111 111 111 111 111
OR
0xFF FF FF = 11111111 11111111 11111111

Keep in mind the former is ludicrous because you'd need to work in groups of three bytes to avoid any unused bits, and at the same time you're processing two more digits at a time than you are with hex. And what do you do if you want to represent half a byte? With hex you'd need 0xF, but with octal you'd need o17, and the two most significant bits of the leading digit go unused. It's only half as bad when representing the aforementioned o377 (011 111 111) byte. As you can see, you waste the leading bits of every value whose bit count isn't a multiple of three.
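
The unused leading bits can be counted with a few lines of Python. This is a sketch, and the helper name octal_padding is made up for the example.

```python
def octal_padding(bit_count: int) -> tuple:
    """Return (octal digits needed, unused leading bits) for a value of bit_count bits."""
    digits = (bit_count + 2) // 3  # round up to whole triplets
    return digits, digits * 3 - bit_count


for bits in (4, 8, 24):
    print(bits, "bits ->", octal_padding(bits))

# The same values written as octal and hex literals.
print(0o17 == 0xF, 0o377 == 0xFF, 0o77777777 == 0xFFFFFF)
```

Half a byte needs two octal digits with two bits unused, a full byte needs three digits with one bit unused, and only at 24 bits (three bytes) does everything line up.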

Confusing yet? So you count to 7, and larger values aren't quickly translatable to decimal unless you have them memorized or have computer assistance (e.g., o12 = 10, o144 = 100, o1750 = 1000). Then for any full bytes you'd like to create, you'll need to create three at a time if you don't want to waste the 1 to 2 most significant bits.
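
For the computer assistance, Python's built-in int() accepts a base, and oct() goes the other way. A short sketch, with the helper name octal_to_decimal made up for the example:

```python
def octal_to_decimal(text: str) -> int:
    """Parse a string of octal digits (no prefix) into an integer."""
    return int(text, 8)


for text in ("12", "144", "1750"):
    print("o" + text, "=", octal_to_decimal(text))

# And back again; Python prefixes octal with 0o.
print(oct(1000))
```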