Coldsp33d/read_clipboard beginner's guide [DRAFT]

## Beginner's Guide to `pd.read_clipboard`

[`read_clipboard`](http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#id45) is truly a saving grace for anyone starting out to answer questions in the [tag:pandas] tag. Unfortunately, pandas veterans also know that the data provided in questions isn't always easy to load into a terminal due to various complications such as MultiIndexes, spaces in header names, datetimes, and Python objects.

Thankfully, `read_clipboard` has arguments that make handling most of these cases possible (and easy). The purpose of this answer is to document some of those cases in finer detail.

---

### Spaces in column headers
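
A minimal sketch of one way to handle this, assuming the columns are separated by two or more spaces while the header names contain only single spaces. The sample data is made up for illustration, and `io.StringIO` with `read_csv` stands in for the clipboard so the snippet runs anywhere; `read_clipboard` forwards its keyword arguments to `read_csv`, so the same `sep` should work after copying the text.

```python
import io

import pandas as pd

# Hypothetical sample data: header names contain single spaces,
# columns are separated by two or more spaces.
text = """\
first name  last name  age
John        Smith      34
Mary Ann    Lee        28
"""

# Split only on runs of 2+ whitespace characters, so single spaces
# inside a header or value are preserved. A regex separator needs
# the python engine.
df = pd.read_csv(io.StringIO(text), sep=r"\s{2,}", engine="python")

# Clipboard equivalent (after copying the text above):
# df = pd.read_clipboard(sep=r"\s{2,}", engine="python")
print(df)
```

This only works when the copied text really does use wider gaps between columns than within them; otherwise the headers have to be fixed by hand.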


---

### Read a Series instead of a DataFrame
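
A minimal sketch, assuming the copied text is a headerless Series printout with the index in the first column. The `squeeze` keyword of `read_csv` was removed in pandas 2.0, so this sketch calls `DataFrame.squeeze("columns")` on the result instead. The sample data is made up, and `io.StringIO` stands in for the clipboard.

```python
import io

import pandas as pd

# Hypothetical Series printout: index labels on the left, values on the right.
text = """\
a    1
b    2
c    3
"""

# Read as a one-column DataFrame indexed by the first column,
# then collapse that single column into a Series.
ser = pd.read_csv(
    io.StringIO(text), sep=r"\s+", header=None, index_col=0
).squeeze("columns")

# Clipboard equivalent (after copying the text above):
# ser = pd.read_clipboard(header=None, index_col=0).squeeze("columns")
print(ser)
```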

---

### Python objects

Numeric data is the simpler case.

String data may need `yaml` to parse.

---

### Other considerations

`read_clipboard` uses `read_csv` under the hood, so many of the principles for loading data from CSV apply here, such as:

- parsing datetimes (use `parse_dates`)
- no headers (use `header=None`)
- custom names (use `names=[...]`)
- set a column as the index (use `index_col=[...]`)
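- read a Series instead of a DataFrame (use `squeeze=True`; the keyword was removed in pandas 2.0, where you append `.squeeze("columns")` to the call instead)
- specify a custom separator (use `sep='...'`; if it is multi-character or a regex, use `engine='python'`)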


And so on. See [here](https://stackoverflow.com/a/56231664/4909087) for a more comprehensive list.
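
A few of the options listed above can be combined in one call. The snippet below is a sketch with made-up data and column names; `io.StringIO` with `read_csv` stands in for the clipboard, and the same keyword arguments can be passed to `read_clipboard`.

```python
import io

import pandas as pd

# Hypothetical headerless, comma-separated data.
text = """\
2019-01-01,10,x
2019-01-02,20,y
"""

df = pd.read_csv(
    io.StringIO(text),
    sep=",",                           # custom separator
    header=None,                       # the data has no header row
    names=["date", "value", "label"],  # so supply column names
    parse_dates=["date"],              # parse this column as datetimes
    index_col="date",                  # and use it as the index
)

# Clipboard equivalent (after copying the text above):
# df = pd.read_clipboard(sep=",", header=None,
#                        names=["date", "value", "label"],
#                        parse_dates=["date"], index_col="date")
print(df)
```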

---

### Limitations of `read_clipboard`

- Cannot parse prettytable/tabulate output (in other words, table borders make parsing harder). Check out some homemade attempts at tackling this.
- Cannot ignore ellipses in data (you'll need to remove them manually).
- Cannot load data from images (if you're up to the task, you can make a tesseract extension that does).
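
For the bordered-table case, one homemade workaround is to clean the text before handing it to the parser. This is only a sketch, assuming a simple prettytable-style layout with `+---+` rules and `|` column borders; the sample table is made up, and the text is held in a string rather than read from the clipboard.

```python
import io

import pandas as pd

# Hypothetical prettytable-style output.
bordered = """\
+----+-------+
| id | name  |
+----+-------+
| 1  | alice |
| 2  | bob   |
+----+-------+
"""

# Drop the horizontal rules, then strip the outer pipes and padding
# from each remaining row.
rows = [
    line.strip().strip("|").strip()
    for line in bordered.splitlines()
    if line.strip() and not line.lstrip().startswith("+")
]

# Split on the inner pipes, swallowing the padding around them.
df = pd.read_csv(io.StringIO("\n".join(rows)), sep=r"\s*\|\s*", engine="python")
print(df)
```

Cells that themselves contain a `|` character would break this approach.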


---

### Other useful `pd.read_clipboard` questions for unconventionally formatted data