Real-world networks carry an increasingly large and diverse volume of data encapsulated in packets. Surveillance and monitoring of this traffic are necessary tasks for law enforcement, cybersecurity, and intelligence agencies. Intercepted network traffic must be classified along several dimensions, such as the protocol encapsulation layers it contains, the application it originates from, the user who generated it, and whether it is malicious or benign. There is a lack of solutions that can classify packets individually, without relying on flow-based features. To address these gaps in current traffic classification and deep packet inspection (DPI) techniques, we present the initial release of the Forager toolkit, a set of software tools that extract hidden representations from individual packets and use them as features in deep learning models for traffic classification. Forager uses data mining techniques to automatically generate regular expression signatures, locality-sensitive hash fingerprints, and matrix and point cloud representations of packets. These serve as input features for corresponding deep learning models, which can classify single packets in a real system. The models are multi-modal, capturing multiple angles and dimensions of the features in order to handle more complex classification problems, and they can be run in parallel for throughput and scalability. Our experiments apply these models in multiple configurations and scenarios to demonstrate superior performance and classification capability, advancing the state of the art in complex network traffic surveillance and hidden representation learning.