@jborean93
Last active April 8, 2024 06:42
Reads a file on an SMB share
# Copyright: (c) 2019, Jordan Borean (@jborean93) <jborean93@gmail.com>
# MIT License (see LICENSE or https://opensource.org/licenses/MIT)

import uuid

from contextlib import contextmanager
from io import BytesIO

from smbprotocol.connection import Connection
from smbprotocol.session import Session
from smbprotocol.open import CreateDisposition, FileAttributes, FilePipePrinterAccessMask, ImpersonationLevel, Open, \
    ShareAccess
from smbprotocol.tree import TreeConnect


@contextmanager
def smb_b_open(path, mode='r', share='r', username=None, password=None, encrypt=True):
    r"""
    Functions similar to the builtin open() method where it will create an open handle to a file over SMB. This can be
    used to read and/or write data to the file using the methods exposed by the Open() class in smbprotocol. Read and
    write operations only support bytes and not text strings.

    :param path: The full UNC path to the file to open. This should be '\\server\share\path.txt'.
    :param mode: The mode in which the file is to be opened, can be set to one of the following:
        'r': Opens the file for reading (default)
        'w': Opens the file for writing, truncating first
        'x': Create a new file and open it for writing, fail if the file already exists
    :param share: The SMB sharing mode to set for the opened file handle, can be set to one or more of the following:
        'r': Allows other handles to read from the file (default)
        'w': Allows other handles to write to the file
        'd': Allows other handles to delete the file
    :param username: Optional username to use for authentication, required if Kerberos is not used.
    :param password: Optional password to use for authentication, required if Kerberos is not used.
    :param encrypt: Whether to use encryption or not. Must be set to False if using an older SMB dialect.
    :return: The opened smbprotocol Open() obj that has read, write, and flush functions.
    """
    # Split the UNC path into the server, the share, and the path relative to the share.
    path_split = [e for e in path.split('\\') if e]
    if len(path_split) < 3:
        raise ValueError("Path should specify at least the server, share, and file to read.")

    server = path_split[0]
    share_name = path_split[1]
    file_path = "\\".join(path_split[2:])

    conn = Connection(uuid.uuid4(), server)
    conn.connect()
    try:
        session = Session(conn, username=username, password=password, require_encryption=encrypt)
        session.connect()
        try:
            tree = TreeConnect(session, r"\\%s\%s" % (server, share_name))
            tree.connect()
            try:
                # Translate the open()-style mode into an SMB create disposition and access mask.
                if mode == 'r':
                    create_disposition = CreateDisposition.FILE_OPEN
                    access_mask = FilePipePrinterAccessMask.GENERIC_READ
                elif mode == 'w':
                    create_disposition = CreateDisposition.FILE_OVERWRITE_IF
                    access_mask = FilePipePrinterAccessMask.GENERIC_WRITE
                elif mode == 'x':
                    create_disposition = CreateDisposition.FILE_CREATE
                    access_mask = FilePipePrinterAccessMask.GENERIC_WRITE
                else:
                    raise ValueError("Invalid mode value specified.")

                # Combine the requested sharing flags into a single bitmask.
                share_map = {
                    'r': ShareAccess.FILE_SHARE_READ,
                    'w': ShareAccess.FILE_SHARE_WRITE,
                    'd': ShareAccess.FILE_SHARE_DELETE,
                }
                share_access = 0
                for s in share:
                    share_access |= share_map[s]

                obj = Open(tree, file_path)
                obj.create(
                    ImpersonationLevel.Impersonation,
                    access_mask,
                    FileAttributes.FILE_ATTRIBUTE_NORMAL,
                    share_access,
                    create_disposition,
                    0,
                )
                try:
                    yield obj
                finally:
                    obj.close()
            finally:
                tree.disconnect()
        finally:
            session.disconnect()
    finally:
        conn.disconnect()


# Reads a file and stores it in a BytesIO buffer.
b_io = BytesIO()
with smb_b_open(r'\\server.domain.local\c$\temp\file.xlsx') as file_obj:
    offset = 0
    while offset < file_obj.end_of_file:
        length = file_obj.connection.max_read_size
        data = file_obj.read(offset, length)
        b_io.write(data)
        offset += length

b_io.seek(0)
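
The UNC path handling at the top of smb_b_open can be tried without a server. The sketch below pulls the same splitting logic into a standalone helper; the name `split_unc_path` is hypothetical and is not part of the gist or of smbprotocol.

```python
def split_unc_path(path):
    # Hypothetical helper mirroring the path handling in smb_b_open:
    # returns (server, share, path relative to the share).
    parts = [p for p in path.split('\\') if p]
    if len(parts) < 3:
        raise ValueError("Path should specify at least the server, share, and file to read.")
    return parts[0], parts[1], "\\".join(parts[2:])


server, share_name, file_path = split_unc_path(r'\\server.domain.local\c$\temp\file.xlsx')
print(server)      # server.domain.local
print(share_name)  # c$
print(file_path)   # temp\file.xlsx
```

A path with only a server and a share, such as `\\server\share`, raises ValueError because there is no file component left to open.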
@jp495
jp495 commented Jul 9, 2019

Thanks a ton for this! I'm running into credit issues -- even just to read a file <64 kilobytes. If I add conn.echo(sid=session.session_id, timeout=60, credit_request=33), I increase the credits available to 65 (from 33), but I can't seem to request more beyond that.

I'm definitely limited by my understanding of SMB/credits here.

Traceback (most recent call last):
  File "/Users/XXX/Documents/workspace/grs/dfst_etl/dfst_etl/smb_scrape.py", line 116, in <module>
    data = file_obj.read(offset, length)
  File "/Users/XXX/Documents/workspace/grs/dfst_etl/venv/lib/python3.7/site-packages/smbprotocol/open.py", line 1100, in read
    self.tree_connect.tree_connect_id)
  File "/Users/XXX/Documents/workspace/grs/dfst_etl/venv/lib/python3.7/site-packages/smbprotocol/connection.py", line 922, in send
    credit_request)
  File "/Users/XXX/Documents/workspace/grs/dfst_etl/venv/lib/python3.7/site-packages/smbprotocol/connection.py", line 1093, in _generate_packet_header
    raise smbprotocol.exceptions.SMBException(error_msg)
smbprotocol.exceptions.SMBException: Request requires 128 credits but only 33 credits are available

@jborean93
Author

Sigh, I've mostly forgotten about credit charges, so my advice may not be optimal, or there may be a bug in the code that I should fix :(. Any message that exceeds, or has the potential to exceed, 64 kilobytes requires a credit charge, which is calculated by:

CreditCharge = (max(SendPayloadSize, Expected ResponsePayloadSize) – 1) / 65536 + 1

In this case the request requires 128 credits but only 33 credits are available. Whether this is some configuration policy on the server end I'm not 100% sure, but for your case I would reduce the length of each read operation. Your best bet to get this working is to change the read code to something like:

offset = 0
buffer = 65536
while offset < file_obj.end_of_file:
    data = file_obj.read(offset, buffer)
    b_io.write(data)
    offset += buffer

This will limit each read request to 64kb and for larger files would mean more roundtrips to and from the server. Looks like you do have some credits available to play with so you could potentially increase it to a higher number to reduce the number of roundtrips. Sorry I don't have a better solution but hopefully it gets you going. If you find any more info that might be useful then please let me know.
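
As a rough check of the credit charge formula above, here is a minimal Python sketch. The function name `credit_charge` is hypothetical, the division is assumed to be integer division, and clamping the size to at least 1 byte is an added assumption so that an empty payload still costs one credit.

```python
def credit_charge(send_payload_size, expected_response_payload_size):
    # CreditCharge = (max(SendPayloadSize, ExpectedResponsePayloadSize) - 1) / 65536 + 1
    # Assumption: integer division, and sizes below 1 byte are treated as 1 byte.
    largest = max(send_payload_size, expected_response_payload_size, 1)
    return (largest - 1) // 65536 + 1


print(credit_charge(0, 65536))    # 1: a 64 KiB read fits in a single credit
print(credit_charge(0, 65537))    # 2: one byte over needs a second credit
print(credit_charge(0, 8388608))  # 128: an 8 MiB read length
```

An 8 MiB read length works out to 128 credits, which would be consistent with the "requires 128 credits" message in the traceback if the negotiated max_read_size was 8,388,608 bytes.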

@jp495
jp495 commented Jul 9, 2019

Did some digging and tweaking. I'm not sure why that while loop requires 128 credits regardless of the actual size of the file.

file_obj.read(0, 4259840) works up to (roughly) the theoretical limit of 4,259,840 bytes (65 credits * 64KB per credit), but some of the Excel files I'm reading are approaching double that size. I can hack together a way to split that up based on the size of the file, but I don't imagine it would be as elegant or efficient as your while loop. I'm just not sure why it requires so many credits.

edit: I was writing this as you wrote your reply above and didn't see it until I posted the comment. I'll try what you mentioned above. Thanks again!

@jborean93
Author

No worries, SMB traditionally had a limit of 64kb per request and I believe SMB 2 or 2.1 added the ability for a larger MTU which is governed by credits given out by the server. The smbprotocol code will request the number of credits required for each request and update the header accordingly. It's up to the server to either grant the credit request or reject it. Your case seems to be the latter so you would need to reduce the size of the read length. As to what's governing the logic on the server end to accept or reject the credit request I'm not entirely sure how/if it is configurable.

@jp495
jp495 commented Jul 9, 2019

Gotcha. In the original gist (line 101), why would length parameter be the max_read_size of the file object rather than just the length/size of the file object? Maybe I misunderstand what max_read_size is -- I couldn't find a good explanation for it in either your documentation or the MS-SMB2 docs.

Implementing that fix and setting the buffer to its max (4,259,840) requires 26.7 seconds to read that 6.8MB Excel file, or 66 seconds with a buffer of 65536, if you were curious. Not ideal performance, but I don't see a convenient way to speed that up.

Related -- is there any easy way up front to check the default credit allocation? I know echo returns how many extra credits I successfully lobby for, but I can't find how to check the base amount.

@jborean93
Author

why would length parameter be the max_read_size of the file object rather than just the length/size of the file object?

According to the SMB2 NEGOTIATE Response message, the MaxReadSize is:

The maximum size, in bytes, of the Length in an SMB2 READ Request (section 2.2.19) that the server will accept.

So if the length of the file exceeds the maximum read size then it will fail whereas we are doing it in chunks to make sure we don't exceed this value.
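
A minimal sketch of such a chunked read loop, assuming only that the file object exposes `read(offset, length)` and `end_of_file` as used in the gist. The `read_in_chunks` helper and the `FakeOpen` stand-in are hypothetical and exist only so the loop can be run without an SMB server.

```python
from io import BytesIO


def read_in_chunks(file_obj, chunk_size):
    # Yield the file contents in pieces of at most chunk_size bytes.
    offset = 0
    while offset < file_obj.end_of_file:
        # Never ask for more than what is left in the file.
        length = min(chunk_size, file_obj.end_of_file - offset)
        data = file_obj.read(offset, length)
        if not data:
            break  # stop rather than loop forever on an empty read
        yield data
        offset += len(data)


class FakeOpen:
    # Hypothetical in-memory stand-in for an opened SMB file, for demonstration only.
    def __init__(self, payload):
        self._payload = payload
        self.end_of_file = len(payload)

    def read(self, offset, length):
        return self._payload[offset:offset + length]


b_io = BytesIO()
for chunk in read_in_chunks(FakeOpen(b"x" * 150000), 65536):
    b_io.write(chunk)
b_io.seek(0)
```

With a real handle the same loop would be called with whatever chunk size the server accepts, for example 65536 or a multiple of it that the available credits cover.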

Not ideal performance, but I don't see a convenient way to speed that up.

Yep I would need to investigate more into how credits are actually granted to see if there was a better way to dynamically do this. Keeping in mind this is all written in Python so performance will never be as good as something written in C or some other lower level language.

Related -- is there any easy way up front to check the default credit allocation?

Currently the client does not keep track of the credits allocated to it, AFAIK you shouldn't need to but I could be wrong. It's been too long since I've played around with it so maybe there is a reason why I should.

@jp495
jp495 commented Jul 9, 2019

Okay, that makes sense. Weird that the max read size is so large (in my case, 8,388,608B) compared to the credits it allocates to clients.

Only reason I would want to keep track of credit allocation is so I can dynamically scale the buffer size (as a function of my credit allocation) in case one day I find myself being allocated fewer credits. Not a particularly big deal, and something I can mitigate in other ways.
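
Scaling the buffer as a function of the credit allocation could look something like the sketch below. It assumes the caller already knows how many credits are available, which the client does not expose according to the reply above, so `available_credits` is a value you would have to track yourself; `read_length_for_credits` is a hypothetical name.

```python
BYTES_PER_CREDIT = 65536  # each credit covers up to 64 KiB of payload


def read_length_for_credits(available_credits, max_read_size):
    # Largest read length that the given credits can pay for,
    # capped at the server's negotiated MaxReadSize.
    if available_credits < 1:
        raise ValueError("At least one credit is needed to send a request.")
    return min(available_credits * BYTES_PER_CREDIT, max_read_size)


print(read_length_for_credits(65, 8388608))  # 4259840
print(read_length_for_credits(33, 8388608))  # 2162688
```

With 65 credits and a max read size of 8,388,608 bytes this gives 4,259,840 bytes, the same limit found by experiment earlier in the thread.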

@jborean93
Author

In general you can keep track of the credits yourself. smbprotocol will only request credits based on its consumption, so unless you've manually requested more using an echo command it should always be at 1.

@hdm0511
hdm0511 commented Sep 30, 2022

Hi @jborean93,
I am trying to use smbprotocol and smbclient, but I can't connect to the Samba server:

Traceback (most recent call last):
  File "smb_get_file.py", line 16, in <module>
    get_file()
  File "smb_get_file.py", line 6, in get_file
    smbclient.register_session('100.190.110.103', username='unix', password='unix')
  File "/home/unix/.local/lib/python3.8/site-packages/smbclient/_pool.py", line 374, in register_session
    connection.connect(timeout=connection_timeout)
  File "/home/unix/.local/lib/python3.8/site-packages/smbprotocol/connection.py", line 799, in connect
    smb_response = self._send_smb2_negotiate(dialect, timeout, enc_algos, sign_algos)
  File "/home/unix/.local/lib/python3.8/site-packages/smbprotocol/connection.py", line 1497, in _send_smb2_negotiate
    response = self.receive(request, timeout=timeout)
  File "/home/unix/.local/lib/python3.8/site-packages/smbprotocol/connection.py", line 925, in receive
    raise SMBException("Connection timeout of %d seconds exceeded while waiting for a message id %s "
smbprotocol.exceptions.SMBException: Connection timeout of 60 seconds exceeded while waiting for a message id 0 response from the server


import smbclient


def get_file():
    smbclient.register_session('100.190.110.103', username='unix', password='unix')
    smbclient.rmdir(r"\\100.190.110.103\shared\new")

    with smbclient.open_file(r"\\10.19.11.13\shared\new\file.txt", mode="w") as fd:
        fd.write(u"file contents")


if __name__ == '__main__':
    get_file()
