MRobertEvers/WebDevNotes.md

## WebDevNotes.md

      
            
          
Web Development Notes

Notes on the Buzzword soup of web development.
JavaScript


JavaScript is single-threaded in the web browser. (Web Workers provide a threading-like interface.)
It operates on an event loop. https://developer.mozilla.org/en-US/docs/Web/JavaScript/EventLoop
What does the new operator do? https://stackoverflow.com/questions/6750880/how-does-the-new-operator-work-in-javascript

JavaScript Event Loop

https://www.youtube.com/watch?v=8aGhZQkoFbQ
JavaScript is basically a language interpreter with a stack and a heap. It is single-threaded. Asynchronous programming is achieved through the libraries that the environment provides. E.g. setTimeout is offered by the web browser and is accessible from JavaScript. In that case, the browser waits the given amount of time and then places the callback into the JavaScript event queue. When the main thread's stack is empty, the event loop moves the callback onto the stack, and JavaScript executes it.
Environments like Node.js offer libraries for file system and HTTP usage. Those 'native' libraries are provided by the Node environment and may be multithreaded internally; however, that thread control is not accessible from JavaScript.
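The ordering described above can be observed with a small sketch. The function name runEventLoopDemo and the log array are made up for illustration; only setTimeout comes from the environment.

```javascript
// Records the order in which synchronous code and a timer callback run.
function runEventLoopDemo(log) {
  log.push('sync: start');
  // The environment queues this callback. It cannot run until the stack
  // is empty, even though the requested delay is 0 ms.
  setTimeout(() => log.push('callback: timer fired'), 0);
  log.push('sync: end');
}

const log = [];
runEventLoopDemo(log);
// At this point `log` holds only the two synchronous entries.
// The timer entry is appended later, after the current script has finished.
```

Inspecting log from inside a later setTimeout callback should show all three entries, with the timer entry last.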
HTML Specification Stuff

https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Flow_Layout/Block_and_Inline_Layout_in_Normal_Flow
https://developer.mozilla.org/en-US/docs/Web/CSS/Visual_formatting_model
CSS Notes

CSS is included in an html page via
<link rel="stylesheet" href="<url>">

Relative URLs starting WITHOUT a '/' replace the last level in the url.
E.g. If you're viewing page http://mywebsite.com/cat/dog/hen.html, and the href is href="css/hen.css", the browser will search
http://mywebsite.com/cat/dog/css/hen.css

Relative URLs starting WITH a '/' replace the entire url.
E.g. If you're viewing page http://mywebsite.com/cat/dog/hen.html, and the href is href="/css/hen.css", the browser will search
http://mywebsite.com/css/hen.css
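The two resolution rules above can be checked with the standard URL class, which takes a relative reference and a base URL. This is only a sketch using the example addresses from these notes.

```javascript
const page = 'http://mywebsite.com/cat/dog/hen.html';

// No leading '/': only the last path segment (hen.html) is replaced.
const withoutSlash = new URL('css/hen.css', page).href;

// Leading '/': the whole path is replaced.
const withSlash = new URL('/css/hen.css', page).href;

console.log(withoutSlash); // http://mywebsite.com/cat/dog/css/hen.css
console.log(withSlash);    // http://mywebsite.com/css/hen.css
```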


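Browsers render the whitespace between inline elements (e.g. buttons) as a visible gap. One workaround is to set the parent's font size to zero and restore it on the children. https://stackoverflow.com/questions/43431856/remove-blank-spaces-between-buttons-in-html-css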
#parent-container {
  font-size: 0;
}

button {
  font-size: 14px; /* Set the font size you need here */
}

Node

Node.js is the JavaScript run-time removed from the browser. It has file IO and other features that you'd expect from a "server" language.
Node.js offers the require('module') function. This function returns everything assigned to the module.exports object of the targeted module, if one exists.
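A minimal sketch of require and module.exports follows. To keep it self-contained it writes a throwaway module (the hypothetical greeter.js) to a temporary folder first; in a real project that file would simply live next to your code.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a tiny module on disk so the sketch can run on its own.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'require-demo-'));
const modulePath = path.join(dir, 'greeter.js');
fs.writeFileSync(
  modulePath,
  "const secret = 'not exported';\n" +
  "module.exports = { greet: (name) => 'Hello, ' + name };\n"
);

// require() returns whatever the module assigned to module.exports.
const greeter = require(modulePath);
console.log(greeter.greet('Node')); // Hello, Node
console.log(greeter.secret);        // undefined
```

Names that are not placed on module.exports, like secret here, stay private to the module.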
Debugging Client and Server

To debug Chrome, get the Debugger for Chrome extension for VSCode. VSCode supports launch configurations. If there are none, simply try to launch a Chrome debug session to add one. Otherwise, when looking at launch.json, there is a button to add a configuration. Node.js and Chrome require separate debug configurations (they are separate run-time processes), but VSCode can launch multiple debug configurations with one command, so you can debug client and server code together. See the Debugger for Chrome README for details. The Node.js debugger is fairly straightforward. The example below launches multiple configurations at once, aka compounds.
{
    "version": "0.2.0",
    "configurations": [
        {
            "type": "node",
            "request": "launch",
            "name": "LaunchServerDebug",
            "program": "${workspaceFolder}/server.js"
        },
        {
            "type": "chrome",
            "request": "launch",
            "name": "LaunchChromeDebug",
            "url": "http://localhost:5000/static/index.html",
            "webRoot": "${workspaceFolder}/static"
        }
    ],
    "compounds": [
        {
            "name": "Server/Client",
            "configurations": ["LaunchServerDebug", "LaunchChromeDebug"]
        }
    ]
}

Express

Express is a Node.js library that helps route HTTP requests to callbacks.
The Express app is itself a requestListener for Node's built-in http module. On each request event, the HTTP server passes the listener an http.IncomingMessage and an http.ServerResponse. These are the objects Express builds on, seen as the req and res names throughout its middleware.
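The idea of routing requests to callbacks through a requestListener can be sketched with only the built-in http module. This is not how Express is implemented; the routes table and the listener below are hypothetical stand-ins for illustration.

```javascript
const http = require('http');

// Hypothetical route table mapping "METHOD path" to a callback.
const routes = {
  'GET /hello': (req, res) => {
    res.statusCode = 200;
    res.end('hello');
  },
};

// A requestListener: Node calls it with
// (http.IncomingMessage, http.ServerResponse) for every request event.
function requestListener(req, res) {
  const handler = routes[`${req.method} ${req.url}`];
  if (handler) {
    handler(req, res);
  } else {
    res.statusCode = 404;
    res.end('not found');
  }
}

// The server is created but not started; server.listen(5000) would start it.
const server = http.createServer(requestListener);
```

An Express app can be passed to http.createServer in the same position as requestListener.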
Babel

Babel is a JavaScript transpiler (buzzword for a compiler whose output is the same language). ECMAScript is the official name of the standard behind JavaScript. ECMAScript 6 was not yet fully supported in browsers when these notes were written. Babel can take ECMAScript 6 and compile it to ECMAScript 5, which browsers do support.
Babel can take extensions that run other code processors on the code. Babel natively supports ECMAScript 6 to 5, and supports plugins for compiling JSX and for use with Webpack.
Webpack

Since browsers must be served JS as complete, runnable source code, it is ideal to make that source as small as possible before serving it. This is what Webpack does: it bundles the source files, often into a single bundle.js. So if you see a bundle.js, it has probably been webpacked.
Webpack allows the use of import and export even in non-ES6 code. It acts on those keywords automatically at build time. ES6 also defines those keywords, but Webpack resolves them itself when bundling.
React and JSX

React is a DOM manipulation library. It supports a syntax called JSX, which looks a lot like HTML. JSX is compiled to JavaScript; Babel's React plugin does this.
All together,
1. JS source files with ECMAScript 6 and JSX: Files
2. Files -> Babel toolchain.
3. Babel toolchain: JSX compilation, then ECMAScript 6 to 5, then Webpack
4. Output a runnable bundle.

NPM - Node Package Manager

A lot like Python's pip. Instead of installing to global run-time libs, it installs to a local folder. (It looks like you can still install libs globally.) NPM also provides an interface for Node. package.json is an NPM config file and contains the macros/scripts config. The scripts entry in package.json contains npm macros that can be called with npm run <script-key-name> (a few well-known names, such as start, can be called without run). E.g.
"scripts":{
    "start": "node server.js"
}

will execute node server.js when you call npm start.
NPM can download packages like pip does for Python. npm install <name> will download the package and put it in the node_modules folder of the project folder. The --save flag means that the package will be added to the package.json's dependencies list (this is the default since npm 5). This lets you keep only your source code and the package.json file, then run npm install to install the dependencies listed. This is useful for storing only your code in a repo: after cloning the repo, npm install will fetch its dependencies.
ES6

As stated above, import and export are used to import and export names from JS files. They work in a pretty straightforward way in Node.js: just name the file. For whatever reason, the buzzworders never mention how it works in browsers, where you obviously can't just name a file to import. I finally found someone talking about it! You set type="module" on the <script> tag.
Postman

Postman is a convenient interface for sending HTTP requests.
Prevent Duplicate Submissions

Care should be taken when accepting submissions from users. If a button is clicked twice, the user may cause the server to perform the action twice. Thus, the server should make efforts to verify a request is unique. One way to do this is to include a unique ID in the request that can only be changed upon response from the server.
E.g. A hidden form field with a request_id is submitted to the server. When the server sees the request_id for the first time, it records that the id has been or is being processed. If it sees that request_id again, it simply ignores the request. When the server is done processing, it returns a new request_id, which can then be used for further interaction.
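The request_id scheme above might look like the following on the server. It is a minimal in-memory sketch; handleSubmission, newRequestId and seenRequestIds are hypothetical names, and a real server would need persistent storage and expiry of old ids.

```javascript
const crypto = require('crypto');

// Record of request ids that have been (or are being) processed.
const seenRequestIds = new Set();

function newRequestId() {
  return crypto.randomBytes(16).toString('hex');
}

// Runs `action` at most once per request_id.
function handleSubmission(requestId, action) {
  if (seenRequestIds.has(requestId)) {
    return { accepted: false }; // duplicate: ignore it
  }
  seenRequestIds.add(requestId);
  action();
  // Hand back a fresh id for the next interaction.
  return { accepted: true, nextRequestId: newRequestId() };
}

// Simulate a double click on the same form.
let transfers = 0;
const formId = newRequestId();
const first = handleSubmission(formId, () => { transfers += 1; });
const second = handleSubmission(formId, () => { transfers += 1; });
console.log(transfers); // 1
```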
CORS - Cross Origin Resource Sharing

Say we connect to www.google.com and the JavaScript running on that page tries to connect to another website, say www.facebook.com. Unless the response from www.facebook.com includes a header allowing CORS (Access-Control-Allow-Origin), the web browser will stop and throw a CORS error. Good answer on CORS by the Ullrich guy https://security.stackexchange.com/questions/108835/how-does-cors-prevent-xss
CORS is not a security measure. It is a mechanism for working with an existing security mechanism in web browsers, the Same-Origin Policy, under which the browser automatically withholds data from 'cross origin' responses. This stops malicious JavaScript from reading another site's data, though it does not stop the request itself from being sent, see CSRF below. If the browser didn't prevent this, a cross-origin request could ask another site for sensitive user information, and the malicious JavaScript would get that information. CORS exists to allow cross-origin requests without introducing more security issues. A server that includes a CORS header is basically declaring that requests to it are not harmful and cannot be abused, or at least they shouldn't be.
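The browser's decision can be approximated as below. This is a heavily simplified sketch: canScriptReadResponse is a hypothetical helper, and real browsers also handle preflight requests, credentials and more.

```javascript
// Decides whether page JavaScript may read the response to a request.
// `responseHeaders` uses lowercase header names.
function canScriptReadResponse(pageOrigin, requestUrl, responseHeaders) {
  const targetOrigin = new URL(requestUrl).origin;
  if (targetOrigin === pageOrigin) {
    return true; // same origin: always readable
  }
  // Cross origin: readable only if the server opted in via CORS.
  const allowed = responseHeaders['access-control-allow-origin'];
  return allowed === '*' || allowed === pageOrigin;
}

const page = 'https://www.google.com';
console.log(canScriptReadResponse(page, 'https://www.facebook.com/data', {}));
// false
console.log(canScriptReadResponse(page, 'https://www.facebook.com/data', {
  'access-control-allow-origin': 'https://www.google.com',
}));
// true
```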
CSRF - Cross Site Request Forgery

Say we connect to www.my-bank.com and sign in. We now have a signed-in session cookie stored on our computer and can use it to interact with the bank's site. CSRF is when we then go to another site that has malicious JavaScript on it, which tells the browser to send a request to www.my-bank.com (you downloaded the JavaScript with the webpage, which is normal). The intent of this request is to cause side effects, say transferring money, on behalf of the www.my-bank.com user. If you hadn't previously logged in on the bank's site, the bank would reject the request because your browser would not have a valid session cookie. BUT since you logged in recently and might still be logged in (i.e. you still have a valid session), the bank will honor the request: your browser automatically includes the valid session cookie, so the request looks 'valid'. The malicious JavaScript thereby causes side effects that the user did not want, such as transferring money (although most browsers will block the response before handing it to JavaScript, see CORS). Secure sites have defenses against this.
CSRF tokens are a way to protect against CSRF. A CSRF token is served with each page, and it must be present in any request made from that page. Malicious JavaScript on another site can still attempt to send a request to the server, but the server will not honor it because the malicious JavaScript does not have a CSRF token.
I was once confused as to why an AJAX request is an appropriate way to serve CSRF tokens: the browser would include the session cookie in the AJAX request, so the server would return a valid CSRF token, which the attacker could then use to perform the request they wanted. This doesn't work because of the Same-Origin Policy. The browser will perform the AJAX request, and the server will honor it, but the browser will not allow the JavaScript to read the response (unless CORS is enabled on the server's response, which it SHOULD NOT be).
https://stackoverflow.com/questions/20504846/why-is-it-common-to-put-csrf-prevention-tokens-in-cookies
CSRF tokens can be stored in a cookie because only JavaScript served from that domain can read them. The server, HOWEVER, should NOT validate CSRF via that cookie alone, because the browser automatically sends that cookie with each request to that domain, including requests made by malicious JavaScript. The good JavaScript CAN read that cookie and, when making a request, should include the CSRF token somewhere other than the Cookie header.
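A server-side check along these lines might look like the sketch below. The session object, the x-csrf-token header name and both function names are assumptions for illustration; the point is that the token is read from somewhere other than the Cookie header.

```javascript
const crypto = require('crypto');

// Creates a token, remembers it in the session and returns it
// so it can be served with the page.
function issueCsrfToken(session) {
  session.csrfToken = crypto.randomBytes(32).toString('hex');
  return session.csrfToken;
}

// Honors a request only if it carries the session's token in a header
// that the browser does not attach automatically.
function isRequestAllowed(session, request) {
  const submitted = request.headers['x-csrf-token'];
  if (typeof submitted !== 'string' || typeof session.csrfToken !== 'string') {
    return false;
  }
  const a = Buffer.from(submitted);
  const b = Buffer.from(session.csrfToken);
  // timingSafeEqual throws on different lengths, so compare lengths first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}

const session = {};
const token = issueCsrfToken(session);
console.log(isRequestAllowed(session, { headers: { 'x-csrf-token': token } })); // true
console.log(isRequestAllowed(session, { headers: {} })); // false
```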
XSS - Cross Site Scripting

Why is it called XSS? (XSS is confusing because it often doesn't include another website)
https://security.stackexchange.com/questions/135545/why-is-it-called-cross-site-scripting-xss
XSS is when a website reflects user input onto the page without sanitizing it.
E.g. If a user submits a comment that the page will then display, the user could include JavaScript such as <script>alert('something malicious')</script>. Since the website serves the page with that string in it, browsers will interpret the string as valid JavaScript and execute it.
Stored XSS is when the website presents a page with user input that is stored on the server.
An example of non-stored (reflected) XSS, and the reason the name XSS was coined, is XSS enabled through URL parameters.
E.g. A website presents a page that displays a string stored in the URL query, www.website.com?insecurefield=%3Cscript%3Ealert%28%27something%20malicious%27%29%3C%2Fscript%3E. When the website's page sees %3Cscript%3Ealert%28%27something%20malicious%27%29%3C%2Fscript%3E, it decodes it to <script>alert('something malicious')</script>, then presents it with the page. The browser, again, will interpret that as valid JavaScript and execute it. Since malicious agent Bob knows about this insecurity, he can go to another website and trick users into clicking the link www.website.com?insecurefield=%3Cscript%3Ealert%28%27something%20malicious%27%29%3C%2Fscript%3E. Bob's malicious JavaScript can then perform actions for that user on the insecure page. Notice this dodges the Same-Origin Policy because the user is actually on that page. Imagine if a banking platform were vulnerable: users who click the link could find their bank accounts empty.
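One common defense is to escape user input before placing it in HTML. The escapeHtml helper below is a hand-rolled sketch for illustration; in practice a templating library usually does this for you.

```javascript
// Replaces the characters that let text break out into markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// The query value from the example link above.
const fromQuery = decodeURIComponent(
  '%3Cscript%3Ealert%28%27something%20malicious%27%29%3C%2Fscript%3E'
);
console.log(fromQuery);
// <script>alert('something malicious')</script>
console.log(escapeHtml(fromQuery));
// &lt;script&gt;alert(&#39;something malicious&#39;)&lt;/script&gt;
```

The escaped string is displayed by the browser as text instead of being executed as a script.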
HTTP Session

The example below is what an HTTP session might look like. Notice that the sid is stored in a cookie.
HTTP/1.1 200 OK
via: insertA
Date: Thu, 31 Aug 2017 22:06:05 GMT
Vary: Accept-Encoding
Server: Apache-Coyote/1.1
Set-Cookie: connect.sid=s%3AUnBWGdX0GVGSnXc0hZ1TqyPsvsAt0sI4

A session is often used to persist state between webpages. For example, logging into a website produces a session id that the browser sends to the website on later requests, and the website uses it to serve data that only a logged-in user should see. Consecutive site visits carry that session id, which indicates that the logged-in user is visiting the site. Note that secure sites should only be visited over HTTPS, because otherwise an attacker could capture the session id and use it to send malicious commands that the website would attribute to the user. With HTTPS the session id is always encrypted in transit, so an attacker on the network cannot read it.
HTTP Connection Header

HTTP RFC 2068 section 19.7.1 details persistent connections and the meaning of Connection: Keep-Alive. Generally it means that the TCP connection will stay open after the transaction is complete.
The length of time a connection is kept alive is determined by the implementations of the client and the server. One of them will decide to terminate the connection after some period of inactivity (or at least should).
TLS

TLS 1.2 RFC 5246. Section 7.3 contains a good overview of the TLS handshake. Section 7.4 is a more detailed explanation of TLS Handshake.
mTLS (or "mTLS certificate") is sometimes used to refer to the certificate that the client is required to send to the server during the TLS handshake.
ECC. Good description here http://andrea.corbellini.name/2015/05/23/elliptic-curve-cryptography-finite-fields-and-discrete-logarithms/. Also check out the author's other posts.
DHE is a key exchange algorithm that stands for Diffie-Hellman Ephemeral (as opposed to static DH). You can see this in TLS cipher suites. Ephemeral means that a fresh key pair is generated for each handshake and discarded after use.
https://security.stackexchange.com/questions/120140/what-is-the-difference-between-dh-and-dhe
RSA. Decent (not great) RSA example https://brilliant.org/wiki/rsa-encryption/
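To make the DHE idea above concrete, here is a sketch using Node's built-in crypto module. It is not TLS, only the bare key exchange; ephemeralExchange is a hypothetical helper, and the modp14 group is an arbitrary choice for the example.

```javascript
const crypto = require('crypto');

// Performs one Diffie-Hellman exchange with fresh (ephemeral) keys.
function ephemeralExchange() {
  const client = crypto.getDiffieHellman('modp14');
  const server = crypto.getDiffieHellman('modp14');
  // New key pairs are generated for every exchange.
  client.generateKeys();
  server.generateKeys();
  // Each side combines its own private key with the other's public key.
  const clientSecret = client.computeSecret(server.getPublicKey());
  const serverSecret = server.computeSecret(client.getPublicKey());
  return { clientSecret, serverSecret };
}

const session1 = ephemeralExchange();
const session2 = ephemeralExchange();
// Both sides of one session derive the same shared secret.
console.log(session1.clientSecret.equals(session1.serverSecret)); // true
// A later session uses fresh keys, so its secret is different.
console.log(session1.clientSecret.equals(session2.clientSecret)); // false
```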
Internet Protocol

BGP (Border Gateway Protocol) is the protocol routers use to exchange reachability information with neighboring networks and so discover IP paths.
OAuth2

https://stackoverflow.com/questions/35985551/how-does-csrf-work-without-state-parameter-in-oauth2-0
Parts of the URL

URI Syntax RFC 3986
https://developer.mozilla.org/en-US/docs/Learn/Common_questions/What_is_a_URL
Breaking it down:
http://matt@www.example.com:80/path/to/myfile.html?key1=value1&key2=value2#SomewhereInTheDocument
http   ->   The protocol (others include ftp, https)
matt   ->   The user information, part of the authority matt@www.example.com:80 (e.g. The server can use this to identify who is making the request)
www.example.com    ->   The domain name. (Your computer uses DNS servers to correlate an IP address)
:80   ->   The port number. (This is optional. Some protocols have default ports: 80 for http, 443 for https)
/path/to/myfile.html    ->    Path to 'file'. Sometimes called a route. Represents an endpoint on the server.
?key1=value1&key2=value2    ->    Parameters. Sometimes called the query. Denoted by ?
#SomewhereInTheDocument    ->    The anchor. Sometimes called the fragment. Denoted by #. NEVER SENT TO SERVER. Tells the browser to jump to a part of the page.
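The same breakdown can be reproduced with the standard URL class. The parts object is just a convenient name for this sketch.

```javascript
const url = new URL(
  'http://matt@www.example.com:80/path/to/myfile.html?key1=value1&key2=value2#SomewhereInTheDocument'
);

const parts = {
  protocol: url.protocol,   // 'http:'
  username: url.username,   // 'matt'
  hostname: url.hostname,   // 'www.example.com'
  port: url.port,           // '' because 80 is the default port for http
  pathname: url.pathname,   // '/path/to/myfile.html'
  search: url.search,       // '?key1=value1&key2=value2'
  key1: url.searchParams.get('key1'), // 'value1'
  hash: url.hash,           // '#SomewhereInTheDocument'
};
console.log(parts);
```

Note that the URL class reports an empty port when the URL uses the protocol's default port.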

Anatomy of HTTP Request

Good read for Node.js https://nodejs.org/en/docs/guides/anatomy-of-an-http-transaction/
https://www.ntu.edu.sg/home/ehchua/programming/webprogramming/HTTP_Basics.html
HTTP 1.1 RFC 7230
HTTP REQUEST

GET /docs/index.html HTTP/1.1    ->   REQUEST LINE := method SP request-target SP HTTP-version CRLF
Host: www.nowhere123.com    -> This and below are HEADER FIELDS.
Accept: image/gif, image/jpeg, */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
(blank line)

HTTP RESPONSE

HTTP/1.1 200 OK    ->    STATUS LINE
Date: Sun, 18 Oct 2009 08:56:53 GMT
Server: Apache/2.2.14 (Win32)
Last-Modified: Sat, 20 Nov 2004 07:16:26 GMT
ETag: "10000000565a5-2c-3e94b66c2e680"
Accept-Ranges: bytes
Content-Length: 44
Connection: close
Content-Type: text/html
X-Pad: avoid browser bug

<html><body><h1>It works!</h1></body></html>

List of headers at this time https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers.
The Content-Type header indicates what kind of parser should be used on the body of the message.
https://tools.ietf.org/html/rfc7230#section-3.3 The presence of a message body in a request is signaled by a Content-Length or Transfer-Encoding header.
Message syntax from RFC
 HTTP-message   = start-line
                  *( header-field CRLF )
                  CRLF
                  [ message-body ]

Notice that there are two CRLFs between the last header and the message body. This is because each header line ends in a CRLF, and an empty line (a lone CRLF) then marks the end of the headers. This syntax applies to both requests and responses.
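The message grammar above can be turned into a small parser. This is a sketch only: parseHttpMessage is a hypothetical helper that ignores repeated headers, chunked bodies and malformed input.

```javascript
// Splits a raw HTTP message into start line, headers and body.
function parseHttpMessage(raw) {
  // The empty line (CRLF CRLF) separates the head from the body.
  const separator = raw.indexOf('\r\n\r\n');
  const head = separator === -1 ? raw : raw.slice(0, separator);
  const body = separator === -1 ? '' : raw.slice(separator + 4);

  const [startLine, ...headerLines] = head.split('\r\n');
  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(':');
    if (colon === -1) continue; // skip lines that are not header fields
    const name = line.slice(0, colon).trim().toLowerCase();
    headers[name] = line.slice(colon + 1).trim();
  }
  return { startLine, headers, body };
}

const message = parseHttpMessage(
  'HTTP/1.1 200 OK\r\n' +
  'Content-Length: 44\r\n' +
  'Content-Type: text/html\r\n' +
  '\r\n' +
  '<html><body><h1>It works!</h1></body></html>'
);
console.log(message.startLine);               // HTTP/1.1 200 OK
console.log(message.headers['content-type']); // text/html
console.log(message.body.length);             // 44
```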
Set-Cookie Header

Interesting tidbit https://tools.ietf.org/html/rfc7230#section-3.2.2
Since it cannot be combined into a
single field-value, recipients ought to handle "Set-Cookie" as a
special case while processing header fields. 

See the next section.
Cookies

Cookies are bits of data that the server sends to the browser. The browser may store the cookie and send it back upon further interactions with the server. Cookies have been developed throughout the last 20 years or so, and the current spec is https://tools.ietf.org/html/rfc6265.
The Set-Cookie header is specified in section 4.1 https://tools.ietf.org/html/rfc6265#section-4.1.
The Cookie request header is specified in section 4.2 https://tools.ietf.org/html/rfc6265#section-4.2.
Note the important attributes specified: Expires, Max-Age, Domain, Path, Secure, HttpOnly. Each of these attributes is defined in the RFC.
The Domain attribute specifies the hosts to which the cookie will be sent. For example, if the value of the Domain attribute is "example.com", the user agent (browser) will include the cookie in the Cookie header when making HTTP requests to example.com, www.example.com, and www.corp.example.com.
Cookies are handled automatically by the browser. Client-side JavaScript can read cookies belonging to the domain that the JavaScript was downloaded from. JavaScript cannot read cookies from other domains; the browser does not allow it (part of the Same-Origin Policy).
The HttpOnly flag on a cookie tells the browser not to expose the cookie to JavaScript.
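The Domain example above can be expressed as a matching function. This is a simplified sketch of the domain matching idea from RFC 6265; domainMatches is a hypothetical helper and it skips details such as excluding IP addresses.

```javascript
// Returns true if a cookie with the given Domain attribute
// would be sent on a request to `requestHost`.
function domainMatches(requestHost, cookieDomain) {
  const host = requestHost.toLowerCase();
  // A leading dot in the attribute value is ignored.
  const domain = cookieDomain.toLowerCase().replace(/^\./, '');
  return host === domain || host.endsWith('.' + domain);
}

console.log(domainMatches('example.com', 'example.com'));          // true
console.log(domainMatches('www.corp.example.com', 'example.com')); // true
console.log(domainMatches('badexample.com', 'example.com'));       // false
```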
Javascript Document Model

<element>.addEventListener('<event>', function(){}); is the syntax for adding event listeners to HTML elements. See also on-event handlers and the more general addEventListener.
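HTML elements get addEventListener from the EventTarget interface, so the call can be tried outside a page with a bare EventTarget. This sketch assumes an environment where EventTarget and Event are globals (browsers and recent Node.js versions); the button name is only illustrative.

```javascript
// Stand-in for an HTML element such as a <button>.
const button = new EventTarget();

let clicks = 0;
button.addEventListener('click', function () {
  clicks += 1;
});

// In a page the browser dispatches the event when the user clicks.
button.dispatchEvent(new Event('click'));
button.dispatchEvent(new Event('click'));
console.log(clicks); // 2
```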
PHP

PHP is a hypertext processor. It is a specialized language for handling hypertext (it seems). It is a server-side scripting language.
https://www.w3schools.com/Php/php_intro.asp
CSS

https://developer.mozilla.org/en-US/docs/Learn/CSS/Introduction_to_CSS/How_CSS_works
CSS Declaration := <property> ":" <value>
<property> := Human-readable identifiers that indicate which stylistic features
    (e.g. font, width, background color) you want to change.
<value> := Each specified property is given a value, which indicates how you want to change those stylistic features 
    (e.g. what you want to change the font, width or background color to.)

CSS Declaration Block :=
"{"
    CSS Declaration ";" 
    [CSS Declaration ";"]
"}"

CSS Selector := Specialized string that identifies elements. E.g. ".<class>" indicates it applies to a class.
    "#<id>" indicates it applies to an id.
    "<tag>" indicates it applies to an HTML tag.
    
CSS Rule := CSS Selector CSS Declaration Block

See https://developer.mozilla.org/en-US/docs/Learn/CSS/Introduction_to_CSS/Syntax for at-rules.
Attribute selectors here https://developer.mozilla.org/en-US/docs/Learn/CSS/Introduction_to_CSS/Attribute_selectors
More on selectors here https://developer.mozilla.org/en-US/docs/Learn/CSS/Introduction_to_CSS/Selectors.
Pseudo classes here https://developer.mozilla.org/en-US/docs/Learn/CSS/Introduction_to_CSS/Pseudo-classes_and_pseudo-elements.
Pseudo classes are denoted by a selector followed by ":<pseudoclass>". This means that the CSS declaration block only applies to elements matched by the selector while they are in a particular state. Also take a look at pseudo elements at the link above.
Security And General Good To Know

https://softwareengineering.stackexchange.com/questions/46716/what-technical-details-should-a-programmer-of-a-web-application-consider-before
Multiple Websites on a single IP

Multiple websites can be hosted on the same port and IP address. This is accomplished by the Host: <Hostname> header.
GET /docs/index.html HTTP/1.1   
Host: www.nowhere123.com    -> This is the HOST header. Allows multiple websites on a single IP.
(blank line)

Hosting multiple websites per IP is accomplished at the HTTP level. DNS will resolve multiple hostnames to a single IP, but only HTTP can indicate which website you actually want.
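A server might pick the website from the Host header roughly like this. The sites table, its contents and selectSite are hypothetical, and www.example.com is an invented second site for the example.

```javascript
// Hypothetical table of websites served from the same IP and port.
const sites = {
  'www.nowhere123.com': 'content for nowhere123',
  'www.example.com': 'content for example',
};

// Chooses a site based on the Host header (lowercase header names).
function selectSite(headers) {
  const hostHeader = headers.host || '';
  // Drop an optional ":port" suffix and normalize case.
  const hostname = hostHeader.split(':')[0].toLowerCase();
  return Object.prototype.hasOwnProperty.call(sites, hostname)
    ? sites[hostname]
    : null;
}

console.log(selectSite({ host: 'www.nowhere123.com' }));  // content for nowhere123
console.log(selectSite({ host: 'www.example.com:80' }));  // content for example
console.log(selectSite({ host: 'unknown.test' }));        // null
```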